From Preventive to Reactive: How AI Coding Assistants Transform Developers' Security Awareness

Annoor Sharara Akhand; Faisal Haque Bappy; Raiful Hasan; Sidratul Muntaher Meheraj; Tahrim Hossain; Tarannum Shaila Zaman; Tariqul Islam; Tasfia Tabassum

arxiv: 2605.23130 · v1 · pith:2H5BSJDVnew · submitted 2026-05-22 · 💻 cs.HC · cs.CR

Faisal Haque Bappy, Tahrim Hossain, Sidratul Muntaher Meheraj, Annoor Sharara Akhand, Tasfia Tabassum, Tarannum Shaila Zaman, Raiful Hasan, Tariqul Islam

Pith reviewed 2026-05-25 04:04 UTC · model grok-4.3

keywords AI coding assistants · security awareness · software development · preventive security · reactive security · developer practices · AI-assisted coding

The pith

AI coding assistants reorganize security thinking by shifting it from writing code to reviewing code rather than removing it.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that AI coding assistants do not eliminate developers' attention to security but relocate it from the initial act of generating code to the later act of inspecting it. This move from preventive to reactive security arises because the tools present code generation primarily as a functional task. None of the 15 participants in the coding sessions included security requirements in their first prompts, even when they knew the relevant details, showing a gap between awareness and action. The pattern held across different levels of experience with AI tools. These observations matter because they explain how everyday use of AI assistants changes the human side of writing secure software.

Core claim

AI coding assistants reorganize rather than eliminate security thinking, shifting it from the act of writing code to the act of reviewing it. This transition from preventive to reactive security is structurally encouraged by interaction models that frame code generation as a functional task, leaving security as an afterthought. Notably, none of the coding session participants specified security requirements in their initial prompts, even when they possessed the relevant knowledge, revealing a decoupling of security awareness from security behavior. Developers had independently invented informal coping strategies to manage AI security risk, none of which are supported by current tools or by organizations.

What carries the argument

The shift from preventive to reactive security, driven by interaction models that treat code generation as a mainly functional task.

If this is right

Security awareness becomes decoupled from security behavior when developers use AI coding assistants.
Developers rely on self-created informal strategies to address security risks from AI-generated code.
Experience level with AI tools does not reliably determine how well security is handled.
Current AI tools and organizations provide no support for the coping strategies developers have created.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Design changes to AI assistants could insert security prompts at the start of code generation sessions.
Training for secure coding may need to emphasize review skills more than initial prompting skills.
Organizations could update policies to require explicit security checks early when AI tools are in use.
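
The first extension above (inserting security prompts at the start of a session) can be sketched in a few lines. This is a minimal illustration and not a design from the paper: the class name `PromptSession`, the `prepare` method, and the checklist items are all hypothetical, and the sketch only builds the text that would be sent to an assistant without calling any real tool.

```python
from dataclasses import dataclass

# Hypothetical checklist; the paper does not prescribe specific items.
DEFAULT_SECURITY_CHECKLIST = (
    "Validate and sanitize all external input.",
    "Use parameterized queries for database access.",
    "Do not hard-code secrets or credentials.",
)


@dataclass
class PromptSession:
    """Tracks one code-generation session and augments only its first prompt."""

    checklist: tuple = DEFAULT_SECURITY_CHECKLIST
    prompts_sent: int = 0

    def prepare(self, user_prompt: str) -> str:
        """Return the text to send to the assistant for this turn."""
        is_first = self.prompts_sent == 0
        self.prompts_sent += 1
        if not is_first:
            # Later turns pass through unchanged.
            return user_prompt
        requirements = "\n".join(f"- {item}" for item in self.checklist)
        return f"{user_prompt}\n\nSecurity requirements:\n{requirements}"
```

Only the first turn is augmented because the paper's observation concerns initial prompts; whether later turns should also carry requirements is a design question the paper leaves open.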

Load-bearing premise

Semi-structured interviews and observed coding sessions with 15 participants accurately capture how security awareness changes in authentic ongoing development work outside the study.

What would settle it

A field study that logs actual developer prompts and code reviews over multiple months in normal work settings and checks whether security requirements appear in initial prompts or only during later review.
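
One way such a log could be analyzed is sketched below. It is an illustration under stated assumptions, not the paper's method: the `Event` record, the `first_security_mention` function, and the keyword pattern are hypothetical, events are assumed to arrive in chronological order, and a real study would need a validated coding scheme rather than keyword matching.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword pattern, for illustration only.
SECURITY_PATTERN = re.compile(
    r"\b(security|secure|sanitiz\w*|validat\w*|injection|xss|csrf|"
    r"authenticat\w*|authoriz\w*|encrypt\w*|hash\w*)\b",
    re.IGNORECASE,
)


@dataclass
class Event:
    session_id: str
    kind: str  # "prompt" or "review"
    text: str


def first_security_mention(events):
    """Map each session to 'initial_prompt', 'later', or 'never'.

    Assumes `events` is in chronological order.
    """
    result = {}
    seen_prompt = set()
    for ev in events:
        sid = ev.session_id
        result.setdefault(sid, "never")
        # The initial prompt is the first "prompt" event seen for a session.
        is_initial = ev.kind == "prompt" and sid not in seen_prompt
        if ev.kind == "prompt":
            seen_prompt.add(sid)
        if result[sid] == "never" and SECURITY_PATTERN.search(ev.text):
            result[sid] = "initial_prompt" if is_initial else "later"
    return result
```

If most sessions were labeled `later` or `never` over months of normal work, that would be consistent with the preventive-to-reactive shift; a large share of `initial_prompt` sessions would count against it.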

Original abstract

AI coding assistants are now central to professional software development, yet their impact on how developers think about and practice security remains poorly understood. While prior work has documented vulnerability rates in AI-generated code, a more fundamental question persists: how do these tools transform security awareness in authentic, ongoing development practice? We conducted semi-structured interviews with 15 professional software engineers and observed them completing security-relevant coding tasks with AI assistance, spanning 3 experience cohorts defined by their relationship to AI tools during professional formation. We find that AI coding assistants reorganize rather than eliminate security thinking, shifting it from the act of writing code to the act of reviewing it. This transition from preventive to reactive security is structurally encouraged by interaction models that frame code generation as a functional task, leaving security as an afterthought. Notably, none of our coding session participants specified security requirements in their initial prompts, even when they possessed the relevant knowledge, revealing a decoupling of security awareness from security behavior. We further document informal coping strategies developers had independently invented to manage AI security risk, none of which are supported by current tools or organizations, and find that the experience cohort did not reliably predict security performance. This paper contributes a practice-grounded account of how AI-assisted development reshapes the human side of secure coding, offering empirical foundations for the design of more security-aware tools, training programs, and organizational policies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that AI coding assistants reorganize rather than eliminate developers' security awareness, shifting it from preventive thinking during code writing to reactive review of generated code. This transition is structurally encouraged by interaction models that treat code generation as a functional task. Evidence comes from semi-structured interviews and observed security-relevant coding sessions with 15 professional software engineers across three experience cohorts defined by their relationship to AI tools; none of the participants specified security requirements in initial prompts despite possessing relevant knowledge, and the study documents informal coping strategies while finding that cohort membership did not reliably predict security performance.

Significance. If the central reorganization claim holds, the work supplies a practice-grounded qualitative account that extends vulnerability-rate studies by documenting how AI tools reshape the human side of secure coding. This could inform the design of security-aware AI assistants, training programs, and organizational policies in HCI and software security.

major comments (2)

[Methods] Methods section: the description of the observed coding sessions does not establish that the security-relevant tasks were framed identically to real-world prompts or that the presence of observers did not alter participants' prompting behavior; this directly undercuts the claim that the absence of security requirements in initial prompts reflects a tool-induced decoupling rather than study demand characteristics.
[Results] Results/Discussion: the single-session snapshot with 15 participants across three cohorts provides no evidence that the observed preventive-to-reactive shift captures cumulative effects over ongoing authentic development practice spanning weeks, leaving the structural claim about interaction models insufficiently secured for generalization.

minor comments (1)

[Abstract/Methods] The abstract and methods refer to '3 experience cohorts defined by their relationship to AI tools during professional formation' without specifying the exact criteria or recruitment details used to assign participants; adding this would improve replicability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for these detailed and constructive comments, which highlight important considerations for methodological transparency and the scope of our claims. We address each point below and indicate planned revisions to the manuscript.

Referee: [Methods] Methods section: the description of the observed coding sessions does not establish that the security-relevant tasks were framed identically to real-world prompts or that the presence of observers did not alter participants' prompting behavior; this directly undercuts the claim that the absence of security requirements in initial prompts reflects a tool-induced decoupling rather than study demand characteristics.

Authors: We agree that the Methods section would benefit from greater detail on task framing and potential observer effects to strengthen the interpretation of the prompting data. In the revised version, we will expand this section to describe how the security-relevant tasks were derived from scenarios reported in the interviews as representative of participants' professional work, including the exact wording used to instruct participants to proceed as they normally would with AI assistance. We will also add discussion of the observational setup, including efforts to conduct sessions in familiar environments and the use of think-aloud protocols to surface reasoning. While observational studies inherently carry some risk of demand characteristics, the consistency between observed behavior and participants' self-reported daily practices in the interviews provides triangulation supporting the decoupling claim. This revision will make the evidential basis more explicit without overstating the controls.

Revision: yes

Referee: [Results] Results/Discussion: the single-session snapshot with 15 participants across three cohorts provides no evidence that the observed preventive-to-reactive shift captures cumulative effects over ongoing authentic development practice spanning weeks, leaving the structural claim about interaction models insufficiently secured for generalization.

Authors: The referee accurately notes a key limitation of the study design. Our data consist of single-session observations supplemented by retrospective accounts from semi-structured interviews about ongoing use. In the revised manuscript, we will update the Discussion to more explicitly state that the preventive-to-reactive reorganization is evidenced in the immediate prompting and review behaviors observed, with interviews providing supporting context on how these patterns manifest in daily work, but that longitudinal tracking would be needed to confirm cumulative effects over weeks. We will temper the structural claim accordingly, framing it as grounded in the documented interaction patterns rather than asserting broad generalization, and suggest future research directions. This addresses the concern while preserving the contribution of the observed shift.

Revision: partial

Circularity Check

0 steps flagged

Empirical qualitative study with no equations, parameters, or derivations exhibits no circularity

The paper reports findings from semi-structured interviews and observed coding sessions with 15 participants across experience cohorts. All central claims (reorganization of security thinking from preventive to reactive, decoupling of awareness from behavior, informal coping strategies) are presented as direct outputs of thematic analysis of the collected interview transcripts and session observations. No equations, fitted parameters, uniqueness theorems, or ansatzes appear. No self-citations are invoked as load-bearing justifications for the core results; prior work is referenced only for context on vulnerability rates. The derivation chain is therefore self-contained and does not reduce any result to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper is an empirical qualitative study and rests on standard domain assumptions of interview-based HCI research rather than mathematical axioms, free parameters, or new postulated entities.

axioms (1)

Domain assumption: Semi-structured interviews and observational studies can reliably surface changes in security awareness and behavior in professional developers.
Invoked to support interpretation of findings from the 15 participants and task observations.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: We conducted semi-structured interviews with 15 professional software engineers and observed them completing security-relevant coding tasks with AI assistance... none of our coding session participants specified security requirements in their initial prompts.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: AI coding assistants reorganize rather than eliminate security thinking, shifting it from the act of writing code to the act of reviewing it.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

A Codebook for Interview Analysis

Table 2 presents the full codebook developed through open and focused coding of the 15 interview transcripts. Codes are org...
[1] Yasemin Acar, Michael Backes, Sascha Fahl, Doowon Kim, Michelle L Mazurek, and Christian Stransky. You get where you’re looking for: The impact of information sources on code security. In 2016 IEEE symposium on security and privacy (SP), pages 289–305. IEEE, 2016.

[2] Ross Anderson. Security engineering: a guide to building dependable distributed systems. John Wiley & Sons, 2010.

[3] Anthropic. Claude. https://claude.com/product/overview, 2026. Accessed: 2026-02-19.

[4] Ian Arawjo, Chelse Swoopes, Priyan Vaithilingam, Martin Wattenberg, and Elena L Glassman. Chainforge: A visual toolkit for prompt engineering and llm hypothesis testing. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, pages 1–18, 2024.

[5] Gagan Bansal, Besmira Nushi, Ece Kamar, Daniel S Weld, Walter S Lasecki, and Eric Horvitz. Updates in human-ai teams: Understanding and addressing the performance/compatibility tradeoff. In Proceedings of the AAAI conference on artificial intelligence, volume 33, pages 2429–2437, 2019.

[6] Shraddha Barke, Michael B James, and Nadia Polikarpova. Grounded copilot: How programmers interact with code-generating models. Proceedings of the ACM on Programming Languages, 7(OOPSLA1):85–111, 2023.

[7] Matt Bishop, Michael Dilger, et al. Checking for race conditions in file accesses. Computing systems, 2(2):131–152, 1996.

[8] Joel Brandt, Philip J Guo, Joel Lewenstein, Mira Dontcheva, and Scott R Klemmer. Two studies of opportunistic programming: interleaving web foraging, learning, and writing code. In Proceedings of the SIGCHI conference on human factors in computing systems, pages 1589–1598, 2009.

[9] Virginia Braun and Victoria Clarke. Using thematic analysis in psychology. Qualitative research in psychology, 3(2):77–101, 2006.

[10] Alan Chan, Carson Ezell, Max Kaufmann, Kevin Wei, Lewis Hammond, Herbie Bradley, Emma Bluemke, Nitarshan Rajkumar, David Krueger, Noam Kolt, et al. Visibility into ai agents. In Proceedings of the 2024 ACM conference on fairness, accountability, and transparency, pages 958–973, 2024.

[11] Kathy Charmaz. Constructing grounded theory (introducing qualitative methods series). Constr. grounded theory, 2014.

[12] John Chen, Xi Lu, Yuzhou Du, Michael Rejtig, Ruth Bagley, Mike Horn, and Uri Wilensky. Learning agent-based modeling with llm companions: Experiences of novices and experts using chatgpt & netlogo chat. In Proceedings of the 2024 CHI conference on human factors in computing systems, pages 1–18, 2024.

[13] Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde De Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, et al. Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374, 2021.

[14] Valerie Chen, Alan Zhu, Sebastian Zhao, Hussein Mozannar, David Sontag, and Ameet Talwalkar. Need help? designing proactive ai assistants for programming. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, pages 1–18, 2025.

[15] Coder Technologies Inc. code-server: VS Code in the Browser. https://github.com/coder/code-server, 2026. MIT License. GitHub repository. Accessed: 2026-02-19.

[16] Clement Fung, Eric Zeng, and Lujo Bauer. Adopting AI to protect industrial control systems: Assessing challenges and opportunities from the Operators’ perspective. In Twenty-First Symposium on Usable Privacy and Security (SOUPS 2025), pages 555–573, 2025.

[17] GitHub. Github copilot: Your ai pair programmer. https://github.com/features/copilot, 2026. Accessed: 2026-02-19.

[18] Leo A Goodman. Snowball sampling. The annals of mathematical statistics, pages 148–170, 1961.

[19] Google. Gemini CLI: An Open-Source AI Agent for the Terminal. https://github.com/google-gemini/gemini-cli, 2025. Apache 2.0 License. GitHub repository. Accessed: 2026-02-19.

[20] Brittany Johnson, Yoonki Song, Emerson Murphy-Hill, and Robert Bowdidge. Why don’t software developers use static analysis tools to find bugs? In 2013 35th International Conference on Software Engineering (ICSE), pages 672–681. IEEE, 2013.

[21] Eirini Kalliamvakou. Research: Quantifying GitHub Copilot’s Impact on Developer Productivity and Happiness. https://github.blog/news-insights/research/research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/, 2022. GitHub Blog. Updated May 21, 2024. Accessed: 2026-02-19.

[22] Rafal Kocielnik, Saleema Amershi, and Paul N Bennett. Will you accept an imperfect ai? exploring designs for adjusting end-user expectations of ai systems. In Proceedings of the 2019 CHI conference on human factors in computing systems, pages 1–14, 2019.

[23] Diana Kramer, Lambert Rosique, Ajay Narotam, Elie Bursztein, Patrick Gage Kelley, Kurt Thomas, and Allison Woodruff. Integrating large language models into security incident response. In Twenty-First Symposium on Usable Privacy and Security (SOUPS 2025), pages 133–148, 2025.

[24] Felix Kretzer, Kristian Kolthoff, Christian Bartelt, Simone Paolo Ponzetto, and Alexander Maedche. Closing the loop between user stories and gui prototypes: an llm-based assistant for cross-functional integration in software development. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, pages 1–19, 2025.

[25] John D Lee and Katrina A See. Trust in automation: Designing for appropriate reliance. Human factors, 46(1):50–80, 2004.

[26] Yoonjoo Lee, Kihoon Son, Tae Soo Kim, Jisu Kim, John Joon Young Chung, Eytan Adar, and Juho Kim. One vs. many: Comprehending accurate information from multiple erroneous and inconsistent ai generations. In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, pages 2518–2531, 2024.

[27] Michael Madaio, Shivani Kapania, Rida Qadri, Ding Wang, Andrew Zaldivar, Remi Denton, and Lauren Wilcox. Learning about responsible ai on-the-job: Learning pathways, orientations, and aspirations. In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, pages 1544–1558, 2024.

[28] Arianna Manzini, Geoff Keeling, Nahema Marchal, Kevin R McKee, Verena Rieser, and Iason Gabriel. Should users trust advanced ai assistants? justified trust as a function of competence and alignment. In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, pages 1174–1186, 2024.

[29] Gary McGraw. Software security. IEEE Security & Privacy, 2(2):80–83, 2004.

[30] Peya Mowar, Yi-Hao Peng, Jason Wu, Aaron Steinfeld, and Jeffrey P Bigham. Codea11y: Making ai coding assistants useful for accessible web development. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, pages 1–15, 2025.

[31] OpenAI. Chatgpt. https://openai.com/chatgpt. Accessed: 2026-02-19.

[33] Anna-Marie Ortloff, Jenny Tang, Arthi Arumugam, Daniel Huschina, Lisa Geierhaas, Florin Martius, Luisa Jansen, Kolja Von Der Twer, Lilly Jungbluth, and Matthew Smith. Replication: “No one can hack my mind”-10 years later: An update and outlook on experts’ and non-experts’ security practices and advice. In Twenty-First Symposium on Usable Privacy a...

[34] OWASP Foundation. OWASP Top 10: 2021. https://owasp.org/Top10/2021/, 2021. Accessed: 2026-02-18.

[35] OWASP Foundation. OWASP Secure Coding Practices – Quick Reference Guide. https://owasp.org/www-project-secure-coding-practices-quick-reference-guide/, 2022. Version 2.0.1. Accessed: 2026-02-19.

[36] Raja Parasuraman and Victor Riley. Humans and automation: Use, misuse, disuse, abuse. Human factors, 39(2):230–253, 1997.

[37] Saumya Pareek, Eduardo Velloso, and Jorge Goncalves. Trust development and repair in ai-assisted decision-making during complementary expertise. In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, pages 546–561, 2024.

[38] Michael Quinn Patton. Qualitative evaluation and research methods. SAGE Publications, inc, 1990.

[39] Hammond Pearce, Baleegh Ahmad, Benjamin Tan, Brendan Dolan-Gavitt, and Ramesh Karri. Asleep at the keyboard? assessing the security of github copilot’s code contributions. Communications of the ACM, 68(2):96–105, 2025.

[40] Sida Peng, Eirini Kalliamvakou, Peter Cihon, and Mert Demirer. The impact of ai on developer productivity: Evidence from github copilot. arXiv preprint arXiv:2302.06590, 2023.

[41] Neil Perry, Megha Srivastava, Deepak Kumar, and Dan Boneh. Do users write more insecure code with ai assistants? In Proceedings of the 2023 ACM SIGSAC conference on computer and communications security, pages 2785–2799, 2023.

[42] Kevin Pu, Daniel Lazaro, Ian Arawjo, Haijun Xia, Ziang Xiao, Tovi Grossman, and Yan Chen. Assistance or disruption? exploring and evaluating the design and trade-offs of proactive ai programming support. In Proceedings of the 2025 CHI conference on human factors in computing systems, pages 1–21, 2025.

[43] Neele Roch, Hannah Sievers, Lorin Schöni, and Verena Zimmermann. Navigating autonomy: unveiling security experts’ perspectives on augmented intelligence in cybersecurity. In Twentieth Symposium on Usable Privacy and Security (SOUPS 2024), pages 41–60, 2024.

[44] Raphael Serafini, Asli Yardim, and Alena Naiakshina. Exploring the impact of intervention methods on developers’ security behavior in a manipulated chatgpt study. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, pages 1–26, 2025.

[45] Violet Turri, Katelyn Morrison, Katherine-Marie Robinson, Collin Abidi, Adam Perer, Jodi Forlizzi, and Rachel Dzombak. Transparency in the wild: Navigating transparency in a deployed ai system to broaden need-finding approaches. In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, pages 1494–1514, 2024.

[46] Priyan Vaithilingam, Tianyi Zhang, and Elena L Glassman. Expectation vs. experience: Evaluating the usability of code generation tools powered by large language models. In Chi conference on human factors in computing systems extended abstracts, pages 1–7, 2022.

[47] Daniel Votipka, Rock Stevens, Elissa Redmiles, Jeremy Hu, and Michelle Mazurek. Hackers vs. testers: A comparison of software vulnerability discovery processes. In 2018 IEEE Symposium on Security and Privacy (SP), pages 374–391. IEEE, 2018.

[48] Ruotong Wang, Ruijia Cheng, Denae Ford, and Thomas Zimmermann. Investigating and designing for trust in ai-powered code generation tools. In Proceedings of the 2024 ACM conference on fairness, accountability, and transparency, pages 1475–1493, 2024.

[49] Yizhu Wang, Haoyu Zhai, Chenkai Wang, Qingying Hao, Nick A Cohen, Roopa Foulger, Jonathan A Handler, and Gang Wang. Can you walk me through it? explainable SMS phishing detection using LLM-based agents. In Twenty-First Symposium on Usable Privacy and Security (SOUPS 2025), pages 37–56, 2025.

[50] Justin D Weisz, Shraddha Vijay Kumar, Michael Muller, Karen-Ellen Browne, Arielle Goldberg, Katrin Ellice Heintze, and Shagun Bajpai. Examining the use and impact of an ai code assistant on developer productivity and experience in the enterprise. In Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, pages 1–13, 2025.

[51] David Gray Widder, Derrick Zhen, Laura Dabbish, and James Herbsleb. It’s about power: What ethical concerns do software engineers have, and what do they (feel they can) do about them? In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, pages 467–479, 2023.

[52] Frank F Xu, Bogdan Vasilescu, and Graham Neubig. In-IDE code generation from natural language: Promise and challenges. ACM Transactions on Software Engineering and Methodology (TOSEM), 31(2):1–47, 2022.

[53] Jingyue Zhang and Ian Arawjo. Chainbuddy: An ai-assisted agent system for generating llm pipelines. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, pages 1–21, 2025.

A Codebook for Interview Analysis

Table 2 presents the full codebook developed through open and focused coding of the 15 interview transcripts. Codes are org...