To LLM, or Not to LLM: How Designers and Developers Navigate LLMs as Tools or Teammates
Pith reviewed 2026-05-15 11:54 UTC · model grok-4.3
The pith
Designers and developers position LLMs as either controllable tools or collaborative teammates, which determines how they assign authority, accountability, and oversight in workflows.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Participants consistently framed LLMs either as tools that remain under direct human control or as teammates that share or blur agency with humans. Tool framings preserved clear lines of decision authority and accountability ownership, allowing integration into existing organisational structures. Teammate framings raised concerns about ambiguous responsibility for outcomes, though some participants described productive teammate arrangements when explicit oversight mechanisms were kept in place. The resulting analytic rubric maps role framing directly onto shifts in authority, accountability, oversight strategies, and organisational acceptability.
What carries the argument
The tool and teammate framings, together with the analytic rubric that shows how each framing alters decision authority, accountability ownership, oversight strategies, and organisational acceptability.
If this is right
- Tool-framed LLMs can be adopted inside existing governance structures without new accountability rules.
- Teammate-framed LLMs require explicit oversight structures to remain organisationally acceptable.
- Ambiguous agency in teammate framings blocks clear justification of responsibility for outcomes.
- Productive teammate use emerges only when collaborative reasoning stays embedded in human oversight.
- The choice between framings is made at design time and shapes downstream workflow integration.
Where Pith is reading between the lines
- Interface designs that make the intended role framing explicit could reduce hesitation around LLM use.
- The same framing logic may appear in other knowledge-work domains such as legal or medical drafting.
- Organisations could codify role-framing guidelines to speed consistent decision-making about new models.
- Training materials that teach designers to articulate their chosen framing might improve accountability documentation.
Load-bearing premise
The framings observed in interviews with 33 participants from three large technology organisations represent stable, general patterns that apply beyond these specific contexts and without major influence from unexamined organisational cultures or recruitment biases.
What would settle it
A replication study in a different industry or smaller organisation that finds participants rely on entirely different role categories or show no consistent distinction between tool and teammate framings when deciding on LLM use.
Original abstract
Large language models (LLMs) are increasingly integrated into design and development workflows, yet decisions about their use are rarely binary or purely technical. We report findings from a constructivist grounded theory study based on interviews with 33 designers and developers across three large technology organisations. Rather than evaluating LLMs solely by capability, participants reasoned about the role an LLM could occupy within a workflow and how that role would interact with existing structures of responsibility and organisational accountability. When LLMs were framed as tools under clear human control, their use was typically acceptable and could be integrated within existing governance structures. When framed as teammates with shared or ambiguous agency, practitioners expressed hesitation, particularly when responsibility for outcomes could not be clearly justified. At the same time, participants also described productive teammate configurations in which LLMs supported collaborative reasoning while remaining embedded within explicit oversight structures. We identify tool and teammate framings as recurring ways in which designers and developers position LLMs relative to human work and present an analytic rubric describing how role framing shapes decision authority, accountability ownership, oversight strategies, and organisational acceptability. By foregrounding design-time reasoning, this work reframes To LLM or Not to LLM as a sociotechnical positioning problem that emerges during system design rather than during post-deployment evaluation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper reports findings from a constructivist grounded theory study of 33 interviews with designers and developers across three large technology organizations. It identifies recurring 'tool' and 'teammate' framings of LLMs relative to human work and presents an analytic rubric showing how these framings shape decision authority, accountability ownership, oversight strategies, and organizational acceptability. The central claim reframes LLM adoption decisions as sociotechnical positioning problems arising during system design.
Significance. If the patterns and rubric hold, the work offers a useful analytic lens for HCI research on AI integration in professional workflows, foregrounding organizational accountability structures over purely technical capability assessments and potentially informing design guidelines for LLM use.
major comments (2)
- [Methods] Methods section: The abstract outlines a constructivist grounded theory approach with 33 interviews but provides no details on coding procedures, saturation criteria, or resolution of contradictions in participant accounts. These omissions are load-bearing for the central claim that tool and teammate framings are recurring and that the rubric reliably captures their effects on authority, accountability, and acceptability.
- [Findings / Discussion] Findings and discussion: The rubric is presented as transferable without explicit scope limitations or cross-context validation; the sample is restricted to three large technology organizations, where accountability structures and risk tolerance may systematically differ from startups, smaller firms, or non-tech domains, weakening the claim that the framings represent stable patterns of reasoning.
minor comments (2)
- [Abstract] Abstract: The final sentence is long and compound; splitting it would improve readability while preserving the reframing claim.
- [Findings] Terminology: 'Teammate' framing is used both for ambiguous agency (hesitation) and for productive collaborative configurations; a brief clarification of subtypes early in the findings would reduce potential reader confusion.
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive comments. We address each major point below and have prepared revisions to improve methodological transparency and clarify the scope of our claims.
- Referee: [Methods] Methods section: The abstract outlines a constructivist grounded theory approach with 33 interviews but provides no details on coding procedures, saturation criteria, or resolution of contradictions in participant accounts. These omissions are load-bearing for the central claim that tool and teammate framings are recurring and that the rubric reliably captures their effects on authority, accountability, and acceptability.
  Authors: We agree that the Methods section requires greater detail to support the claims. Although the full manuscript describes the overall constructivist grounded theory approach and interview protocol, it does not explicitly document the coding procedures, saturation criteria, or how contradictions across accounts were resolved. In the revised manuscript we will expand the Methods section to include these elements: a description of the iterative open and axial coding process, the criteria used to determine theoretical saturation, and the constant-comparison techniques employed to reconcile divergent participant accounts.
  Revision: yes
- Referee: [Findings / Discussion] Findings and discussion: The rubric is presented as transferable without explicit scope limitations or cross-context validation; the sample is restricted to three large technology organizations, where accountability structures and risk tolerance may systematically differ from startups, smaller firms, or non-tech domains, weakening the claim that the framings represent stable patterns of reasoning.
  Authors: We accept the need for clearer scope limitations. The study is indeed confined to three large technology organizations, and we will add an explicit limitations paragraph in the Discussion that notes potential differences in accountability structures, risk tolerance, and governance practices in startups, smaller firms, or non-technology domains. At the same time, we maintain that the tool and teammate framings are presented as recurring patterns observed within the sampled contexts rather than as universally stable across all organizational settings; the rubric is offered as an analytic lens for examining sociotechnical positioning, not as a validated general model. We will revise the language in the Findings and Discussion to avoid any implication of broad transferability without further empirical validation.
  Revision: partial
Circularity Check
No circularity: empirical grounded theory analysis of interview data
The paper reports findings from a constructivist grounded theory study of 33 interviews across three organizations. It identifies recurring tool and teammate framings and presents an analytic rubric describing their effects on authority, accountability, oversight, and acceptability. No mathematical derivations, parameter fitting, or self-citation chains exist; the central claims are direct summaries of observed participant reasoning rather than outputs that reduce to the inputs by construction. The analysis is self-contained as an empirical report of sociotechnical positioning patterns.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Constructivist grounded theory provides a valid way to surface practitioners' reasoning about technology roles from interview data.
Reference graph
Works this paper leans on
- [1] Saleema Amershi, Dan Weld, Mihaela Vorvoreanu, Adam Fourney, Besmira Nushi, Penny Collisson, Jina Suh, Shamsi Iqbal, Paul N. Bennett, Kori Inkpen, Jaime Teevan, Ruth Kikin-Gil, and Eric Horvitz. 2019. Guidelines for Human-AI Interaction. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, ...
- [2] Gagan Bansal, Tongshuang Wu, and Joyce Zhou. 2021. Does the Whole Exceed its Parts? The Effect of AI Explanations on Complementary Team Performance. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, Article 592, 16 pages. doi:10.1145/3411764.3445088
- [3] Anol Bhattacherjee. 2001. Understanding Information Systems Continuance: An Expectation-Confirmation Model. MIS Quarterly 25, 3 (2001), 351–370. doi:10.2307/3250921
- [4] Kathy Charmaz. 2014. Constructing Grounded Theory (2 ed.). SAGE Publications Ltd, London, United Kingdom.
- [5] Malin Eiband, Daniel Buschek, Heinrich Hussmann, and Alexander Butz. 2018. Bringing Transparency Design into Practice. In Proceedings of the 23rd International Conference on Intelligent User Interfaces. Association for Computing Machinery, New York, NY, USA, 211–223. doi:10.1145/3172944.3172961
- [6] Kevin A. Hoff and Masooda Bashir. 2015. Trust in Automation: Integrating Empirical Evidence on Factors That Influence Trust. Human Factors 57, 3 (2015), 407–434. doi:10.1177/0018720814554227
- [7] Himanshu Kaur, Harsha Nori, Samuel Jenkins, Rich Caruana, Hanna Wallach, and Jennifer Wortman Vaughan. 2020. Interpreting Interpretability: Understanding Data Scientists’ Use of Interpretability Tools for Machine Learning. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA...
- [8] John D. Lee and Katrina A. See. 2004. Trust in Automation: Designing for Appropriate Reliance. Human Factors 46, 1 (2004), 50–80. doi:10.1518/hfes.46.1.50_30392
- [9] Q. Vera Liao, Daniel Gruen, and Sarah Miller. 2020. Questioning the AI: Informing Design Practices for Explainable AI User Experiences. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, Article 416, 15 pages. doi:10.1145/3313831.3376590
- [10] Michael A. Madaio, Luke Stark, Jennifer Wortman Vaughan, and Hanna Wallach. 2020. Co-designing Checklists to Understand Organizational Challenges and Opportunities Around Fairness in AI. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, Article 693, 14 pages. doi:10.1145/3313831.3376445
- [11] ... Guo, Robert DeLine, and Sumit Gulwani ...
- [12] Aoife O’Driscoll and Alan F. Blackwell. 2025. Social Norms, Social AI: Investigating the Effects of AI (Im)politeness and Gender on User Perception. In Proceedings of BCS Human-Computer Interaction Conference 2025. BCS Learning & Development Ltd. doi:10.14236/ewic/BCSHCI2025.66
- [13] Richard L. Oliver. 1980. A Cognitive Model of the Antecedents and Consequences of Satisfaction Decisions. Journal of Marketing Research 17, 4 (1980), 460–469.
- [14]
- [15] Haazique Sayyed, Meshari Alwazae, and Varad Vishwarupe. 2025. BlockSafe: Universal Blockchain-Based Identity Management. In Big Data in Finance: Transforming the Financial Landscape: Volume 2. Springer Nature Switzerland, Cham, 57–66.
- [16] Ben Shneiderman. 2020. Human-Centered Artificial Intelligence: Reliable, Safe & Trustworthy. International Journal of Human-Computer Interaction 36, 6 (2020), 495–504. doi:10.1080/10447318.2020.1741118
- [17] Lucy A. Suchman. 2007. Human-Machine Reconfigurations: Plans and Situated Actions (2 ed.). Cambridge University Press, Cambridge, United Kingdom.
- [18] Varad Vishwarupe, Mangesh Bedekar, Milind Pande, and Anil Hiwale. 2018. Intelligent Twitter Spam Detection: A Hybrid Approach. In Smart Trends in Systems, Security and Sustainability. Springer Singapore, Singapore, 189–197.
- [19] Varad Vishwarupe, Alexander Hankey, Shailesh Pangaonkar, Shwetanshu Shekhar, R. Sheena Rani, Meshari Alwazae, Haazique Sayyed, Vishal Pawar, Vidya Kamma, and Priyanka Kuklani. 2025. Predicting Mental Health Ailments Using Social Media Activities and Keystroke Dynamics with Machine Learning. In Big Data in Finance: Transforming the Financial Landscape: Volu...
- [20] Varad Vishwarupe, Prachi Joshi, Shrey Maheshwari, Priyanka Kuklani, Prathamesh Shingote, Milind Pande, Vishal Pawar, and Aseem Deshmukh. 2023. Exploring Human Computer Interaction in Industry 4.0. In AI, IoT, Big Data and Cloud Computing for Industry 4.0. Springer, 21–38. doi:10.1007/978-3-031-29713-7_2
CHI EA ’26, April 13–17, 2026, Barcelona, Spain Vish...
- [21] Varad Vishwarupe, Prachi M. Joshi, Nicole Mathias, Shrey Maheshwari, Shweta Mhaisalkar, and Vishal Pawar. 2022. Explainable AI and Interpretable Machine Learning: A Case Study in Perspective. Procedia Computer Science 204 (2022), 869–876. doi:10.1016/j.procs.2022.08.105
- [22] Varad Vishwarupe, Shrey Maheshwari, Aseem Deshmukh, Shweta Mhaisalkar, Prachi M. Joshi, and Nicole Mathias. 2022. Bringing Humans at the Epicenter of Artificial Intelligence: A Confluence of AI, HCI and Human Centered Computing. Procedia Computer Science 204 (2022), 914–921. doi:10.1016/j.procs.2022.08.111
- [23] Saniya Zahoor, Mangesh Bedekar, Vinod Mane, and Varad Vishwarupe. 2016. Uniqueness in User Behavior While Using the Web. In Proceedings of the International Congress on Information and Communication Technology. Springer Singapore, Singapore, 221–228.
Appendix A: Interview Guide
Interviews were semi-structured and adaptive. Questions were used flexibly ... “I’m accountable, so I need to be able to stand behind the decision” ...