Exploring Individual Factors in the Adoption of LLMs for Specific Software Engineering Purposes
Pith reviewed 2026-05-22 21:44 UTC · model grok-4.3
The pith
Software engineers adopt LLMs for different SE tasks based on distinct sets of individual factors, with some factors reducing adoption when examined alone.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
This study surveyed 188 software engineers and applied the Unified Theory of Acceptance and Use of Technology (UTAUT2) through structural equation modeling to identify the individual factors influencing LLM adoption for five distinct software engineering purposes. The results show that each purpose is affected by unique combinations of factors, and that certain factors negatively influence adoption when analyzed in isolation, highlighting the intricate nature of integrating LLMs into specific SE activities.
What carries the argument
UTAUT2 constructs tested via structural equation modeling on survey responses for each of five separate SE purposes.
If this is right
- Tool developers can design purpose-tuned LLM features instead of one-size-fits-all agents.
- Team leaders can apply different encouragement tactics for artifact generation than for decision-making support.
- Adoption models must treat factors in combination rather than in isolation to avoid misleading negative signals.
- Strategies for LLM integration need to be workflow-specific rather than organization-wide.
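The point about treating factors in combination rather than in isolation can be illustrated with a small sketch. It uses ordinary least squares on a tiny synthetic dataset, not the paper's PLS/SEM models or survey data; the names `factor_a`, `factor_b`, and `adoption` are hypothetical. The data are built so that `factor_b` has a positive coefficient (0.3) in the joint model, yet shows a negative slope when regressed on its own, because it is negatively correlated with the stronger `factor_a`.

```python
from statistics import mean

def simple_slope(x, y):
    """OLS slope of y on a single predictor x (intercept included)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def joint_slopes(x1, x2, y):
    """OLS slopes of y on x1 and x2 together, via the 2x2 normal equations."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    c1 = [a - m1 for a in x1]
    c2 = [a - m2 for a in x2]
    cy = [a - my for a in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(a * a for a in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * b for a, b in zip(c1, cy))
    s2y = sum(a * b for a, b in zip(c2, cy))
    det = s11 * s22 - s12 ** 2  # zero only if x1 and x2 are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

# Synthetic, noise-free illustration (hypothetical values, not survey data).
factor_a = [-2.0, -1.0, 0.0, 1.0, 2.0]
factor_b = [1.5, 1.0, 0.2, -1.0, -1.7]  # negatively correlated with factor_a
adoption = [a + 0.3 * b for a, b in zip(factor_a, factor_b)]

isolated_b = simple_slope(factor_b, adoption)                 # negative
joint_a, joint_b = joint_slopes(factor_a, factor_b, adoption) # about 1.0 and 0.3
print(f"factor_b alone: {isolated_b:.3f}; factor_b in joint model: {joint_b:.3f}")
```

The sign change comes entirely from the correlation between the two predictors, which is why a negative isolated coefficient need not mean the factor itself discourages adoption.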
Where Pith is reading between the lines
- General LLM training programs may show uneven results because the same factor can help one task and hinder another.
- Objective usage metrics collected over time could test whether the reported negative effects persist beyond initial intentions.
- The pattern may extend to other AI tools in engineering, suggesting purpose-specific studies rather than broad adoption surveys.
Load-bearing premise
Self-reported survey answers from 188 engineers accurately capture the causal links between UTAUT2 factors and real usage behavior for each purpose rather than just intentions or social bias.
What would settle it
A follow-up study that replaces self-reports with logged usage data from the same engineers and finds no differences in which factors predict use across the five purposes would undermine the claim.
Original abstract
Context: The advent of Large Language Models (LLMs) is transforming software development, significantly enhancing software engineering (SE) processes. Research has explored their role within development teams, focusing on the specific purposes for which LLMs are used within SE tasks, such as artifact generation, decision-making support, and information retrieval. Despite the growing body of work on LLMs in SE, most studies have centered on broad adoption trends, neglecting the nuanced relationship between individual cognitive and behavioral factors and their impact on purpose-specific adoption. While factors such as perceived effort and performance expectancy have been explored at a general level, their influence on distinct SE purposes remains underexamined. This gap hinders the development of tailored LLM-based systems (e.g., Generative AI Agents) that align with engineers' specific needs and limits the ability of team leaders to devise effective strategies for fostering LLM adoption in targeted workflows.
Objectives: For the reasons mentioned above, this study aims to study the individual factors that drive the choice to use LLMs for distinct SE purposes.
Methods: To achieve the above-mentioned objective, we surveyed 188 software engineers to test the relationship between individual attributes related to technology adoption and LLM adoption across five key purposes, using structural equation modeling (SEM). The Unified Theory of Acceptance and Use of Technology (UTAUT2) was applied to characterize individual adoption behaviors.
Results: The findings reveal that purpose-specific adoption is influenced by distinct factors, some of which negatively impact adoption when considered in isolation, underscoring the complexity of LLM integration in SE.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper surveys 188 software engineers and applies structural equation modeling (SEM) with the UTAUT2 framework to test how individual factors relate to LLM adoption across five specific SE purposes (artifact generation, decision-making support, information retrieval, and two others). The central claim is that each purpose is influenced by a distinct set of UTAUT2 constructs, with some constructs exerting negative effects on adoption when examined in isolation.
Significance. If the SEM results prove robust after proper validation, the work advances technology-acceptance research in SE by shifting from broad adoption studies to purpose-specific analysis. This could inform the design of tailored generative-AI agents and help team leaders target interventions. The empirical survey-plus-SEM design is conventional for UTAUT2 studies, but its value hinges on transparent reporting of model diagnostics and explicit limits on causal inference.
major comments (2)
- [Methods] Methods section: The abstract and available description state that SEM was performed but supply no model-fit statistics (CFI, RMSEA, SRMR), no information on missing-data handling, no multicollinearity diagnostics, and no indication whether the five purpose-specific models were estimated separately or jointly. These omissions prevent verification that the reported path coefficients support the claim of distinct, including negative, effects.
- [Results] Results section: The claim that certain UTAUT2 constructs 'negatively impact adoption when considered in isolation' is presented as a substantive finding. With only cross-sectional self-report data, the paths capture associations among survey items; the manuscript does not report objective usage logs, longitudinal follow-up, or multi-source validation that would be required to treat the coefficients as causal influences on actual behavior.
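For readers unfamiliar with the fit statistics named in the first major comment, the following is a minimal sketch of how two of them are conventionally computed. It assumes a covariance-based estimator that yields a chi-square statistic for both the fitted model and a baseline (independence) model; the function names and the numbers in the usage lines are hypothetical and are not taken from the paper. Some software divides by N rather than N - 1 in RMSEA, so reported values can differ slightly.

```python
import math

def rmsea(chi2, df, n):
    """Root mean square error of approximation.

    chi2: model chi-square, df: model degrees of freedom, n: sample size.
    Uses the common N - 1 convention in the denominator.
    """
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index relative to a baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model, 0.0)
    if d_base == 0.0:
        return 1.0  # neither model shows misfit beyond its degrees of freedom
    return 1.0 - d_model / d_base

# Hypothetical values for illustration only; n = 188 matches the survey size.
print(round(rmsea(chi2=100.0, df=50, n=188), 4))
print(round(cfi(chi2_model=100.0, df_model=50, chi2_base=1000.0, df_base=66), 4))
```

Reporting these per purpose-specific model would let readers check each of the five models against whatever cutoffs the authors adopt.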
minor comments (1)
- [Introduction] The manuscript should explicitly cite the original UTAUT2 reference (Venkatesh et al., 2012) when describing the constructs and should clarify whether any items were adapted for the LLM context.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments, which help improve the clarity and rigor of our manuscript. Below we respond point-by-point to the major comments, indicating the revisions we will undertake.
-
Referee: [Methods] Methods section: The abstract and available description state that SEM was performed but supply no model-fit statistics (CFI, RMSEA, SRMR), no information on missing-data handling, no multicollinearity diagnostics, and no indication whether the five purpose-specific models were estimated separately or jointly. These omissions prevent verification that the reported path coefficients support the claim of distinct, including negative, effects.
Authors: We agree that these methodological details are essential for transparency. In the revised manuscript we will report the model-fit indices (CFI, RMSEA, SRMR) for each of the five models, describe missing-data handling (complete cases were used after listwise deletion of the small number of incomplete responses), provide multicollinearity diagnostics (VIF values for all predictors), and explicitly state that the five purpose-specific models were estimated separately rather than jointly. These additions will enable readers to evaluate the reported coefficients.
revision: yes
-
Referee: [Results] Results section: The claim that certain UTAUT2 constructs 'negatively impact adoption when considered in isolation' is presented as a substantive finding. With only cross-sectional self-report data, the paths capture associations among survey items; the manuscript does not report objective usage logs, longitudinal follow-up, or multi-source validation that would be required to treat the coefficients as causal influences on actual behavior.
Authors: We accept that the cross-sectional, self-reported nature of the data means the SEM paths reflect associations rather than causal effects. We will revise the Results and Discussion sections to replace causal language such as 'negatively impact' with 'are negatively associated with' and will add an explicit limitations paragraph noting the absence of objective usage logs, longitudinal data, or multi-source validation. These changes will better frame the findings as correlational insights within the UTAUT2 framework while preserving the observation that certain constructs show negative coefficients when examined in isolation.
revision: partial
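Two of the diagnostics named in the first response, listwise deletion and variance inflation factors, can be sketched in a few lines. This is an illustrative stdlib-only version under stated assumptions, not the authors' pipeline: the helper names (`listwise_delete`, `solve`, `vif`) and the toy data are hypothetical, and each row is assumed to hold one respondent's numeric predictor scores.

```python
import math

def listwise_delete(rows):
    """Keep only rows with no missing values (None or NaN)."""
    def missing(v):
        return v is None or (isinstance(v, float) and math.isnan(v))
    return [r for r in rows if not any(missing(v) for v in r)]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def vif(rows):
    """VIF_j = 1 / (1 - R_j^2), regressing predictor j on all other predictors.

    Raises ZeroDivisionError if a predictor is perfectly collinear with the rest.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    cen = [[r[j] - means[j] for j in range(p)] for r in rows]
    out = []
    for j in range(p):
        others = [k for k in range(p) if k != j]
        # Normal equations on centered data (centering absorbs the intercept).
        a = [[sum(c[u] * c[v] for c in cen) for v in others] for u in others]
        b = [sum(c[u] * c[j] for c in cen) for u in others]
        beta = solve(a, b)
        ss_tot = sum(c[j] ** 2 for c in cen)
        ss_res = sum(
            (c[j] - sum(bk * c[k] for bk, k in zip(beta, others))) ** 2 for c in cen
        )
        out.append(ss_tot / ss_res)  # equals 1 / (1 - R^2)
    return out

# Toy data: two uncorrelated predictors plus two incomplete responses.
responses = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0],
             [None, 1.0], [float("nan"), 0.0]]
# Toy data: two strongly correlated predictors.
collinear = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0], [5.0, 6.0]]

complete = listwise_delete(responses)
print(len(complete), vif(complete), vif(collinear))
```

Uncorrelated predictors give a VIF of 1, while the strongly correlated pair gives a much larger value, which is the pattern a reader would look for when judging whether a negative coefficient might stem from overlapping constructs.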
Circularity Check
No circularity: empirical SEM on external survey data
The paper reports an empirical study that surveys 188 software engineers and applies structural equation modeling (SEM) using the established UTAUT2 framework to identify associations between individual factors and purpose-specific LLM adoption. Results are obtained by fitting the model to collected survey responses; no mathematical derivation, first-principles prediction, or parameter fitted in one step is then relabeled as an independent prediction in another. The abstract and context contain no self-citation load-bearing steps, no uniqueness theorems imported from the authors' prior work, and no ansatz smuggled via citation. The chain is self-contained against standard external benchmarks for survey-based SEM research.
Axiom & Free-Parameter Ledger
free parameters (1)
- SEM path coefficients
axioms (2)
- Domain assumption: UTAUT2 constructs are appropriate and sufficient to characterize individual LLM adoption behavior in software engineering contexts.
- Domain assumption: Self-reported survey data can be treated as a reliable proxy for actual usage behavior across the five purposes.
Forward citations
Cited by 1 Pith paper
-
To Copilot and Beyond: 22 AI Systems Developers Want Built
Survey of 860 developers reveals 22 desired AI systems for non-coding tasks with explicit constraints on authority, provenance, and quality signals, framed as bounded delegation where AI handles assembly work but not ...
Reference graph
Works this paper leans on
- [1] M. Liu, J. Wang, T. Lin, Q. Ma, Z. Fang, and Y. Wu, “An empirical study of the code generation of safety-critical software using LLMs,” Applied Sciences, vol. 14, no. 3, p. 1046, 2024.
- [2] P. Bhattacharya, M. Chakraborty, K. N. S. N. Palepu, V. Pandey, I. Dindorkar, R. Rajpurohit, and R. Gupta, “Exploring large language models for code explanation,” ArXiv, vol. abs/2310.16673, 2023.
- [3] X. Jiang, Y. Dong, L. Wang, Q. Shang, and G. Li, “Self-planning code generation with large language models,” ACM Transactions on Software Engineering and Methodology, 2023.
- [4] Y. Dong, X. Jiang, Z. Jin, and G. Li, “Self-collaboration code generation via ChatGPT,” ACM Transactions on Software Engineering and Methodology, 2023.
- [5] Y. Yao, J. Duan, K. Xu, Y. Cai, Z. Sun, and Y. Zhang, “A survey on large language model (LLM) security and privacy: The good, the bad, and the ugly,” High-Confidence Computing, p. 100211, 2024.
- [6] S. I. Ross, F. Martinez, S. Houde, M. Muller, and J. D. Weisz, “The programmer’s assistant: Conversational interaction with a large language model for software development,” in Proceedings of the 28th International Conference on Intelligent User Interfaces, 2023, pp. 491–514.
- [7] J. Kumar and S. Chimalakonda, “Code summarization without direct access to code - towards exploring federated LLMs for software engineering,” in Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering, 2024, pp. 100–109.
- [8] M. Schäfer, S. Nadi, A. Eghbali, and F. Tip, “An empirical evaluation of using large language models for automated unit test generation,” IEEE Transactions on Software Engineering, vol. 50, pp. 85–105, 2023.
- [9] Y. Tang, Z. Liu, Z. Zhou, and X. Luo, “ChatGPT vs SBST: A comparative assessment of unit test suite generation,” IEEE Transactions on Software Engineering, vol. 50, pp. 1340–1359, 2023.
- [10] J. Wang, Y. Huang, C. Chen, Z. Liu, S. Wang, and Q. Wang, “Software testing with large language models: Survey, landscape, and vision,” IEEE Transactions on Software Engineering, vol. 50, pp. 911–936, 2023.
- [11] R. Khojah, M. Mohamad, P. Leitner, and F. G. de Oliveira Neto, “Beyond code generation: An observational study of ChatGPT usage in software engineering practice,” Proceedings of the ACM on Software Engineering, vol. 1, no. FSE, pp. 1819–1840, 2024.
- [12] D. Russo, “Navigating the complexity of generative AI adoption in software engineering,” ACM Transactions on Software Engineering and Methodology, vol. 33, no. 5, 2024.
- [13] S. Lambiase, G. Catolino, F. Palomba, F. Ferrucci, and D. Russo, “Investigating the role of cultural values in adopting large language models for software engineering,” 2024. [Online]. Available: https://arxiv.org/abs/2409.05055
- [14] R. Choudhuri, B. Trinkenreich, R. Pandita, E. Kalliamvakou, I. Steinmacher, M. Gerosa, C. Sanchez, and A. Sarma, “What guides our choices? Modeling developers’ trust and behavioral intentions towards GenAI,” 2024. [Online]. Available: https://arxiv.org/abs/2409.04099
- [15] J. White, “Building living software systems with generative & agentic AI,” arXiv preprint arXiv:2408.01768, 2024.
- [16] A. Nguyen-Duc, B. Cabrero-Daniel, A. Przybylek, C. Arora, D. Khanna, T. Herda, U. Rafiq, J. Melegati, E. Guerra, K.-K. Kemell et al., “Generative artificial intelligence for software engineering–a research agenda,” arXiv preprint arXiv:2310.18648, 2023.
- [17] V. Venkatesh, J. Y. L. Thong, and X. Xu, “Consumer acceptance and use of information technology: Extending the unified theory of acceptance and use of technology,” MIS Quarterly, vol. 36, no. 1, pp. 157–178, 2012. [Online]. Available: http://www.jstor.org/stable/41410412
- [18]
- [19] T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell et al., “Language models are few-shot learners,” Advances in Neural Information Processing Systems, vol. 33, pp. 1877–1901, 2020.
- [20] S. Barke, M. B. James, and N. Polikarpova, “Grounded Copilot: How programmers interact with code-generating models,” Proceedings of the ACM on Programming Languages, vol. 7, no. OOPSLA1, pp. 85–111, 2023.
- [21] A. Agossah, F. Krupa, M. Perreira Da Silva, and P. Le Callet, “LLM-based interaction for content generation: A case study on the perception of employees in an IT department,” in Proceedings of the 2023 ACM International Conference on Interactive Media Experiences, 2023, pp. 237–241.
- [22] F. Draxler, D. Buschek, M. Tavast, P. Hämäläinen, A. Schmidt, J. Kulshrestha, and R. Welsch, “Gender, age, and technology education influence the adoption and appropriation of LLMs,” arXiv preprint arXiv:2310.06556, 2023.
- [23] V. Venkatesh, M. G. Morris, G. B. Davis, and F. D. Davis, “User acceptance of information technology: Toward a unified view,” MIS Quarterly, vol. 27, no. 3, pp. 425–478, 2003. [Online]. Available: http://www.jstor.org/stable/30036540
- [24] J. F. Hair Junior, G. T. M. Hult, C. M. Ringle, and M. Sarstedt, “A primer on partial least squares structural equation modeling (PLS-SEM),” 2014.
- [25] D. Russo and K.-J. Stol, “PLS-SEM for software engineering research: An introduction and survey,” ACM Computing Surveys (CSUR), vol. 54, no. 4, pp. 1–38, 2021.
- [26] B. A. Kitchenham and S. L. Pfleeger, “Personal opinion surveys,” in Guide to Advanced Empirical Software Engineering. Springer, 2008, pp. 63–92.
- [27] D. Andrews, B. Nonnecke, and J. Preece, “Conducting research on the internet: Online survey design, development and implementation guidelines,” 2007.
- [28] P. Eyal, R. David, G. Andrew, E. Zak, and D. Ekaterina, “Data quality of platforms and panels for online behavioral research,” Behavior Research Methods, pp. 1–20, 2021.
- [29] B. D. Douglas, P. J. Ewell, and M. Brauer, “Data quality in online human-subjects research: Comparisons between MTurk, Prolific, CloudResearch, Qualtrics, and SONA,” PLOS ONE, vol. 18, no. 3, p. e0279720, 2023.
- [30] D. Russo, “Recruiting software engineers on Prolific,” arXiv preprint arXiv:2203.14695, 2022.
- [31] A. Danilova, A. Naiakshina, S. Horstmann, and M. Smith, “Do you really code? Designing and evaluating screening questions for online surveys with programmers,” in 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). IEEE, 2021, pp. 537–548.
- [32] F. Faul, E. Erdfelder, A. Buchner, and A.-G. Lang, “Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses,” Behavior Research Methods, vol. 41, no. 4, pp. 1149–1160, 2009.
- [33]
- [34]
- [35]
- [36] P. Ralph, N. b. Ali, S. Baltes, D. Bianculli, J. Diaz, Y. Dittrich, N. Ernst, M. Felderer, R. Feldt, A. Filieri et al., “Empirical standards for software engineering research,” arXiv preprint arXiv:2010.03525, 2020.
- [37] A. Alami, M. Zahedi, and N. Ernst, “Are you a real software engineer? Best practices in online recruitment for software engineering studies,” in Proceedings of the 1st IEEE/ACM International Workshop on Methodological Issues with Empirical Studies in Software Engineering, 2024, pp. 52–57.
- [38] L. Perri. (2023) What’s new in artificial intelligence from the 2023 Gartner hype cycle. [Online]. Available: https://www.gartner.com/en/articles/what-s-new-in-artificial-intelligence-from-the-2023-gartner-hype-cycle
- [39] S. Lambiase, G. Catolino, F. Palomba, F. Ferrucci, and D. Russo, “Exploring individual factors in the adoption of LLMs for specific software engineering tasks - online appendix,” 2025. [Online]. Available: https://figshare.com/s/c0d84aafdd5c57dd9099
- [40] J. Henseler, C. M. Ringle, and M. Sarstedt, “A new criterion for assessing discriminant validity in variance-based structural equation modeling,” Journal of the Academy of Marketing Science, vol. 43, pp. 115–135, 2015.
- [41] W. W. Chin et al., “The partial least squares approach to structural equation modeling,” Modern Methods for Business Research, vol. 295, no. 2, pp. 295–336, 1998.
- [42] S. Raithel, M. Sarstedt, S. Scharf, and M. Schwaiger, “On the value relevance of customer satisfaction. Multiple drivers and multiple markets,” Journal of the Academy of Marketing Science, vol. 40, pp. 509–525, 2012.
- [43] A. Eckhardt, S. Laumer, and T. Weitzel, “Who influences whom? Analyzing workplace referents’ social influence on IT adoption and non-adoption,” Journal of Information Technology, vol. 24, pp. 11–24, 2009.