Security Concerns in Generative AI Coding Assistants: Insights from Online Discussions on GitHub Copilot

Monika Swetha Gurupathi; Nalin Arachchilage; Nicolás E. Díaz Ferreyra; Riccardo Scandariato; Zadia Codabux

arXiv: 2604.08352 · v1 · submitted 2026-04-09 · 💻 cs.SE · cs.CR · cs.HC

Nicolás E. Díaz Ferreyra, Monika Swetha Gurupathi, Zadia Codabux, Nalin Arachchilage, Riccardo Scandariato

Pith reviewed 2026-05-10 17:15 UTC · model grok-4.3

classification 💻 cs.SE · cs.CR · cs.HC

keywords generative AI · coding assistants · GitHub Copilot · security concerns · data leakage · adversarial attacks · online discussions · thematic analysis

The pith

Analysis of online developer discussions identifies four primary security concerns with generative AI coding assistants.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The authors set out to understand the security worries that software developers have about tools like GitHub Copilot by looking at what people say in public online spaces. They gathered relevant posts and comments from three major platforms and used topic modeling to group them before identifying common themes through careful reading. This approach shows that users are particularly anxious about their data being exposed, legal issues with code ownership, ways attackers could manipulate the AI, and the chance that the generated code contains vulnerabilities. Knowing these specific concerns matters because it reveals practical challenges that technical tests alone might miss, helping point toward better safeguards in future versions of these assistants.

Core claim

Through the collection of discussion threads from Stack Overflow, Reddit, and Hacker News concerning security issues in GitHub Copilot, followed by BERTopic clustering and thematic analysis, four major areas of concern emerge: potential data leakage, code licensing, adversarial attacks such as prompt injection, and insecure code suggestions. These findings emphasize the limitations and trade-offs involved in applying generative AI to software engineering tasks.

What carries the argument

BERTopic clustering followed by thematic analysis of developer discussion threads on GitHub Copilot security issues.

Load-bearing premise

The sample of discussions from Stack Overflow, Reddit, and Hacker News accurately reflects the security concerns of software developers in general.

What would settle it

A broad survey of professional developers reporting few or no security concerns with generative AI coding assistants would challenge whether the four areas represent widespread views.

Figures

Figures reproduced from arXiv: 2604.08352 by Monika Swetha Gurupathi, Nalin Arachchilage, Nicolás E. Díaz Ferreyra, Riccardo Scandariato, Zadia Codabux.

**Figure 1.** Study Design.

**Figure 2.** Provenance Distribution Across Clusters.

**Figure 3.** Sentiment Distribution Across Platforms.

Original abstract

Generative Artificial Intelligence (GenAI) has become a central component of many development tools (e.g., GitHub Copilot) that support software practitioners across multiple programming tasks, including code completion, documentation, and bug detection. However, current research has identified significant limitations and open issues in GenAI, including reliability, non-determinism, bias, and copyright infringement. While prior work has primarily focused on assessing the technical performance of these technologies for code generation, less attention has been paid to emerging concerns of software developers, particularly in the security realm. OBJECTIVE: This work explores security concerns regarding the use of GenAI-based coding assistants by analyzing challenges voiced by developers and software enthusiasts in public online forums. METHOD: We retrieved posts, comments, and discussion threads addressing security issues in GitHub Copilot from three popular platforms, namely Stack Overflow, Reddit, and Hacker News. These discussions were clustered using BERTopic and then synthesized using thematic analysis to identify distinct categories of security concerns. RESULTS: Four major concern areas were identified, including potential data leakage, code licensing, adversarial attacks (e.g., prompt injection), and insecure code suggestions, underscoring critical reflections on the limitations and trade-offs of GenAI in software engineering. IMPLICATIONS: Our findings contribute to a broader understanding of how developers perceive and engage with GenAI-based coding assistants, while highlighting key areas for improving their built-in security features.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper maps four security concerns from Copilot forum posts but leaves method details thin.

The main takeaway is that developers posting about GitHub Copilot online flag four recurring security issues: data leakage risks, code licensing headaches, adversarial attacks like prompt injection, and suggestions that produce insecure code. The authors collected threads from Stack Overflow, Reddit, and Hacker News, clustered them with BERTopic, and ran thematic analysis to group the points. This gives a direct look at what users actually say rather than what benchmarks show.

The work does a solid job of grounding the discussion in real practitioner language and linking the concerns to practical trade-offs in GenAI coding tools. It extends existing qualitative work on AI assistants by narrowing in on security perceptions, which prior papers have touched on less. The categories come across as plausible and useful for anyone thinking about tool improvements or adoption barriers.

The soft spots sit mostly in the methods. The abstract reports no counts of posts retrieved or filtered, no search terms, and no checks like inter-rater agreement or theme validation steps. Without those, it is difficult to judge how complete or stable the four categories are. Forum samples also tend to capture louder voices and may under-represent developers who work in closed environments and do not post publicly.

This is an exploratory study, not a representative survey. It fits readers who track human factors in software engineering or AI safety and want pointers on where Copilot users see friction. The evidence is honest but limited in scope.

I would send it for peer review. The question is timely, the approach fits the data, and a referee could push for clearer reporting and perhaps broader sources without changing the core contribution.

Referee Report

2 major / 2 minor

Summary. The paper explores security concerns regarding generative AI coding assistants such as GitHub Copilot by retrieving and analyzing public discussions from Stack Overflow, Reddit, and Hacker News. It applies BERTopic clustering to the collected threads followed by thematic analysis to synthesize four major concern categories: potential data leakage, code licensing issues, adversarial attacks (e.g., prompt injection), and insecure code suggestions. The work positions these findings as insights into developer perceptions and trade-offs in GenAI adoption for software engineering tasks.

Significance. If the results hold, the study adds a developer-centered perspective to the literature on GenAI limitations in software engineering, complementing technical evaluations of reliability, bias, and performance. By drawing directly from forum discussions rather than controlled experiments, it highlights practical security worries that could inform tool improvements and future research on human-AI collaboration in coding.

major comments (2)

[Methods] Methods section: The data collection description provides no specifics on search queries/keywords, time period covered, total volume of posts/comments/threads retrieved, or any inclusion/exclusion criteria applied across the three platforms. These details are load-bearing for evaluating whether the four identified categories are robustly supported or influenced by sampling choices.
[Methods] Thematic analysis description (following BERTopic clustering): No information is given on the number of analysts involved, inter-rater reliability metrics, or the validation process used to derive and confirm the four concern categories from the clusters. This omission weakens the ability to assess the reliability of the synthesis step central to the results.
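For readers unfamiliar with the inter-rater reliability metrics raised in the second comment, one widely used option is Cohen's kappa, which corrects the raw agreement between two coders for the agreement expected by chance. The sketch below is only an illustration: the paper does not state which metric, if any, was computed, and the function name `cohen_kappa` and the example labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each category, the product of the two raters'
    # marginal proportions, summed over all categories.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used one identical category throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical theme labels assigned by two analysts to eight discussion threads.
rater_1 = ["leakage", "licensing", "attack", "insecure",
           "leakage", "licensing", "attack", "insecure"]
rater_2 = ["leakage", "licensing", "attack", "insecure",
           "leakage", "insecure", "attack", "insecure"]

kappa = cohen_kappa(rater_1, rater_2)
print(round(kappa, 2))
```

In this made-up example the analysts agree on seven of eight threads (observed agreement 0.875) and chance agreement is 0.25, so kappa comes out at roughly 0.83. Reporting a figure of this kind alongside the consensus procedure would address the referee's concern about the synthesis step.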

minor comments (2)

[Abstract] Abstract: Adding approximate figures for the number of discussions analyzed would help readers gauge the scale of the evidence base supporting the four categories.
[Results] Results: The presentation of the four categories would benefit from explicit mapping back to representative quotes or cluster examples to strengthen traceability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and the recommendation for minor revision. We agree that expanding the Methods section with additional details will strengthen the paper's transparency and allow readers to better assess the robustness of our findings.

Referee: [Methods] Methods section: The data collection description provides no specifics on search queries/keywords, time period covered, total volume of posts/comments/threads retrieved, or any inclusion/exclusion criteria applied across the three platforms. These details are load-bearing for evaluating whether the four identified categories are robustly supported or influenced by sampling choices.

Authors: We agree that these details are essential for reproducibility and evaluating sampling choices. In the revised manuscript, we will expand the data collection subsection to specify the search queries and keywords used on each platform (e.g., terms combining 'GitHub Copilot' with 'security', 'data leak', 'licensing', 'prompt injection', and 'insecure code'), the time period covered (from Copilot's public release in 2021 through the collection date), the total volume of posts/comments/threads retrieved before and after filtering, and the inclusion/exclusion criteria (e.g., English-language discussions directly addressing security concerns, exclusion of duplicates or unrelated threads). This will clarify how the four categories were derived from the sampled discussions. revision: yes
Referee: [Methods] Thematic analysis description (following BERTopic clustering): No information is given on the number of analysts involved, inter-rater reliability metrics, or the validation process used to derive and confirm the four concern categories from the clusters. This omission weakens the ability to assess the reliability of the synthesis step central to the results.

Authors: We acknowledge that the current description of the thematic analysis step is insufficiently detailed. In the revision, we will add a description of the process: the involvement of two authors in reviewing BERTopic clusters (via top terms, representative documents, and manual inspection), the iterative synthesis into the four categories through discussion and consensus-building, and the validation approach (cross-referencing with prior literature on GenAI security and checking for consistency across platforms). If inter-rater reliability was not formally computed, we will note the consensus process used instead. This will improve transparency without altering the reported categories. revision: yes
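The inclusion and exclusion criteria described in the first response could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the post structure, the function names `normalise`, `is_relevant`, and `filter_posts`, and the choice to match on the bare word "copilot" are all assumptions. The security terms are the example keywords quoted in the response, and the English-language check is left out.

```python
# Example keywords taken from the authors' response; the real query list may differ.
SECURITY_TERMS = ("security", "data leak", "licensing",
                  "prompt injection", "insecure code")


def normalise(text):
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def is_relevant(text):
    """Keep a post only if it mentions Copilot together with a security term."""
    body = normalise(text)
    return "copilot" in body and any(term in body for term in SECURITY_TERMS)


def filter_posts(posts):
    """Drop unrelated posts and exact duplicates (after normalisation)."""
    seen = set()
    kept = []
    for post in posts:
        key = normalise(post["text"])
        if key in seen or not is_relevant(post["text"]):
            continue
        seen.add(key)
        kept.append(post)
    return kept


# Hypothetical posts, invented for illustration only.
posts = [
    {"platform": "reddit",
     "text": "Does GitHub Copilot leak my private code? Worried about a data leak."},
    {"platform": "hackernews",
     "text": "does github copilot leak my private code?  worried about a data leak."},
    {"platform": "stackoverflow",
     "text": "How do I enable Copilot in my editor?"},
    {"platform": "reddit",
     "text": "Prompt injection in code review bots"},
]

kept = filter_posts(posts)
print(len(kept), kept[0]["platform"])
```

Only the first post survives here: the second is a duplicate once case and spacing are normalised, the third has no security term, and the fourth never mentions Copilot. Reporting the counts before and after each such step is what would let readers judge how sampling choices shaped the four categories.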

Circularity Check

0 steps flagged

No significant circularity

The paper performs an exploratory qualitative study: it retrieves external forum threads from Stack Overflow, Reddit, and Hacker News, applies BERTopic clustering, and conducts thematic analysis to surface four concern categories. No equations, fitted parameters, predictions, or derivations exist; the results are direct outputs of documented processing steps on independent data. No self-citation load-bearing steps or ansatz smuggling appear in the provided material. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper relies on standard qualitative methods and one core domain assumption about the representativeness of public forum data; it introduces no free parameters, invented entities, or additional axioms.

axioms (1)

domain assumption: Discussions on public forums like Stack Overflow, Reddit, and Hacker News provide representative insights into developers' security concerns with GenAI coding assistants.
The study bases its findings on these sources without addressing potential biases in who posts or what gets discussed.

pith-pipeline@v0.9.0 · 5585 in / 1327 out tokens · 54002 ms · 2026-05-10T17:15:42.410326+00:00 · methodology


[1] [1]

Mousa Al-kfairy, Ahmed Al-Adaileh, and Obsa Sendaba. 2024. ChatGPT Through the Users’ Eyes: Sentiment Analysis of Privacy and Security Issues. InInter- national Symposium on Security and Privacy in Social Networks and Big Data. Springer, 41–67

work page 2024

[2] [2]

Mutahar Ali, Arjun Arunasalam, and Habiba Farrukh. 2025. Understanding Users’ Security and Privacy Concerns and Attitudes Towards Conversational AI Platforms. In2025 IEEE Symposium on Security and Privacy (SP). 298–316

work page 2025

[3] [3]

Alessia Antelmi, Gennaro Cordasco, Daniele De Vinco, and Carmine Spagnuolo

work page

[4] [4]

InCompanion Proceedings of the ACM Web Conference 2023

The age of snippet programming: Toward understanding developer com- munities in stack overflow and reddit. InCompanion Proceedings of the ACM Web Conference 2023. 1218–1224

work page 2023

[5] [5]

Leonardo Banh, Florian Holldack, and Gero Strobel. 2025. Copiloting the future: How generative AI transforms Software Engineering.Information and Software Technology183 (2025), 107751

work page 2025

[6] [6]

Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert- Voss, Katherine Lee, Adam Roberts, Tom B Brown, Dawn Song, Úlfar Erlingsson, EASE Companion 2026, 9–12 June, 2026, Glasgow, Scotland, United Kingdom Díaz Ferreyra et al. et al. 2022. Extracting Training Data from Large Language Models. InProceedings of the 31st USENIX Securi...

work page 2026

[7] [7]

Amanda Casari, Julia Ferraioli, and Juniper Lovato. 2023. Beyond the repository: Best practices for open source ecosystems researchers.Queue21, 2 (2023), 14–34

work page 2023

[8] [8]

Omkar Sandip Chavan, Divya Dilip Hinge, Soham Sanjay Deo, Yaxuan Wang, and Mohamed Wiem Mkaouer. 2024. Analyzing developer-ChatGPT conversations for software refactoring: an exploratory study. InProceedings of the 21st International Conference on Mining Software Repositories. 207–211

work page 2024

[9] [9]

Mark Chen, Jerry Tworek, et al. 2021. Evaluating Large Language Models Trained on Code. (2021). arXiv:2107.03374

work page internal anchor Pith review Pith/arXiv arXiv 2021

[10] [10]

Zhi Chen and Lingxiao Jiang. 2025. Evaluating Software Development Agents: Patch Patterns, Code Quality, and Issue Complexity in Real-World GitHub Sce- narios. In2025 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). 657–668

work page 2025

[11] [11]

Zadia Codabux, Fatemeh Fard, Roberto Verdecchia, Fabio Palomba, Dario Di Nucci, and Gilberto Recupito. 2024. Teaching Mining Software Reposito- ries. InHandbook on Teaching Empirical Software Engineering. Springer, 325–362

work page 2024

[12] [12]

Domenico Cotroneo, Roberta De Luca, and Pietro Liguori. 2025. DeVAIC: A tool for security assessment of AI-generated code.Information and Software Technology177 (2025), 107572

work page 2025

[13] [13]

Domenico Cotroneo, Cristina Improta, Pietro Liguori, and Roberto Natella. 2024. Vulnerabilities in AI Code Generators: Exploring Targeted Data Poisoning At- tacks. InProceedings of the 32nd IEEE/ACM International Conference on Program Comprehension. Association for Computing Machinery, 280–292

work page 2024

[14] [14]

Roland Croft, Yongzheng Xie, Mansooreh Zahedi, M Ali Babar, and Christoph Treude. 2022. An empirical study of developers’ discussions about security challenges of different programming languages.Empirical Software Engineering 27, 1 (2022), 27

work page 2022

[15] [15]

Daniela S Cruzes and Tore Dyba. 2011. Recommended steps for thematic synthesis in software engineering. In2011 international symposium on empirical software engineering and measurement. IEEE, 275–284

work page 2011

[16] [16]

Nicolás E Díaz Ferreyra, Melina Vidoni, Maritta Heisel, and Riccardo Scandariato

work page

[17] [17]

Cybersecurity discussions in Stack Overflow: a developer-centred analysis of engagement and self-disclosure behaviour.Social Network Analysis and Mining 14, 1 (2023), 16

work page 2023

[18] [18]

Mateusz Dolata, Norbert Lange, and Gerhard Schwabe. 2024. Development in times of hype: How freelancers explore Generative AI?. InProceedings of the IEEE/ACM 46th International Conference on Software Engineering. 1–13

work page 2024

[19] [19]

Christof Ebert and Panos Louridas. 2023. Generative AI for software practitioners. IEEE Software40, 4 (2023), 30–38

work page 2023

[20] [20]

Yujia Fu, Peng Liang, Amjed Tahir, Zengyang Li, Mojtaba Shahin, Jiaxin Yu, and Jinfu Chen. 2025. Security Weaknesses of Copilot-Generated Code in GitHub Projects: An Empirical Study.ACM Transactions on Software Engineering and Methodologies(2025). Just Accepted

work page 2025

[21] [21]

Ya Gao and GitHub Customer Research. 2024. Quantifying GitHub Copilot’s Impact in the Enterprise with Accenture. Online. https://shorturl.at/OSXr5

work page 2024

[22] [22]

Vladimir Geroimenko. 2025. Key Security Risks in Prompt Engineering. InThe Essential Guide to Prompt Engineering: Key Principles, Techniques, Challenges, and Security Risks. Springer, 103–120

work page 2025

[23] [23]

Nicolas E Gold and Jens Krinke. 2022. Ethics in the mining of software repositories. Empirical Software Engineering27, 1 (2022), 17

work page 2022

[24] [24]

Sivana Hamer, Marcelo d’Amorim, and Laurie Williams. 2024. Just Another Copy and Paste? cComparing the Security Culnerabilities of ChatGPT Generated Code and Stack Overflow Answers. In2024 IEEE Security and Privacy Workshops (SPW). IEEE, 87–94

work page 2024

[25] [25]

Jan H Klemmer, Stefan Albert Horstmann, Nikhil Patnaik, Cordelia Ludden, Cordell Burton Jr, Carson Powers, Fabio Massacci, Akond Rahman, Daniel Votipka, Heather Richter Lipford, et al . 2024. Using ai assistants in software development: A qualitative study on security practices and concerns. InProceed- ings of the 2024 on ACM SIGSAC Conference on Computer...

[26]

Ratanond Koonchanok, Yanling Pan, and Hyeju Jang. 2024. Public attitudes toward ChatGPT on Twitter: sentiments, topics, and occupations. Social Network Analysis and Mining 14, 1 (2024), 106

[27]

Junjie Li, Aseem Sangalay, Cheng Cheng, Yuan Tian, and Jinqiu Yang. 2024. Fine tuning large language model for secure code generation. In Proceedings of the 2024 IEEE/ACM First International Conference on AI Foundation Models and Software Engineering. 86–90

[28]

Ze Shi Li, Nowshin Nawar Arony, Kezia Devathasan, Manish Sihag, Neil Ernst, and Daniela Damian. 2024. Unveiling the life cycle of user feedback: Best practices from software practitioners. In Proceedings of the 46th IEEE/ACM International Conference on Software Engineering. 1–13

[29]

Sijia Liu, Yuanshun Yao, Jinghan Jia, Stephen Casper, Nathalie Baracaldo, Peter Hase, Yuguang Yao, Chris Yuhao Liu, Xiaojun Xu, Hang Li, et al. 2025. Rethinking machine unlearning for large language models. Nature Machine Intelligence (2025), 1–14

[30]

Mahmoud Nazzal, Issa Khalil, Abdallah Khreishah, and NhatHai Phan. 2024. Promsec: Prompt optimization for secure generation of functional source code with large language models (LLMs). In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security. 2266–2280

[31]

Anh Nguyen-Duc, Beatriz Cabrero-Daniel, Adam Przybylek, Chetan Arora, Dron Khanna, Tomas Herda, Usman Rafiq, Jorge Melegati, Eduardo Guerra, Kai-Kristian Kemell, et al. 2025. Generative artificial intelligence for software engineering—a research agenda. Software: Practice and Experience (2025)

[32]

Liang Niu, Shujaat Mirza, Zayd Maradni, and Christina Pöpper. 2023. CodexLeaks: Privacy Leaks from Code Generation Language Models in GitHub Copilot. In 32nd USENIX Security Symposium (USENIX Security 23). 2133–2150

[33]

Sahrima Jannat Oishwee, Zadia Codabux, and Natalia Stakhanova. 2024. Decoding Android permissions: a study of developer challenges and solutions on Stack Overflow. In Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. 143–153

[34]

Ogobuchi Daniel Okey, Ekikere Umoren Udo, Renata Lopes Rosa, Demostenes Zegarra Rodríguez, and João Henrique Kleinschmidt. 2023. Investigating ChatGPT and cybersecurity: A perspective on topic modeling and sentiment analysis. Computers & Security 135 (2023), 103476

[35]

Hammond Pearce, Baleegh Ahmad, Benjamin Tan, Brendan Dolan-Gavitt, and Ramesh Karri. 2025. Asleep at the Keyboard? Assessing the Security of GitHub Copilot’s Code Contributions. Commun. ACM 68, 2 (2025), 96–105

[36]

Anthony Peruma, Steven Simmons, Eman Abdullah AlOmar, Christian D Newman, Mohamed Wiem Mkaouer, and Ali Ouni. 2022. How do I refactor this? An empirical study on refactoring trends and topics in Stack Overflow. Empirical Software Engineering 27, 1 (2022), 11

[37]

Rafiqul Rabin, Sean McGregor, and Nick Judd. 2025. Malicious and Unintentional Disclosure Risks in Large Language Models for Code Generation. arXiv preprint arXiv:2503.22760 (2025)

[38]

Raphael Serafini, Asli Yardim, and Alena Naiakshina. 2025. Exploring the Impact of Intervention Methods on Developers’ Security Behavior in a Manipulated ChatGPT Study. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems. 1–26

[39]

Trevor Stalnaker, Nathan Wintersgill, Oscar Chaparro, Laura A Heymann, Massimiliano Di Penta, Daniel M German, and Denys Poshyvanyk. 2024. Developer Perspectives on Licensing and Copyright Issues Arising from Generative AI for Software Development. ACM Transactions on Software Engineering and Methodology (2024)

[40]

Catherine Tony, Nicolás E. Díaz Ferreyra, Markus Mutas, Salem Dhiff, and Riccardo Scandariato. 2025. Prompting Techniques for Secure Code Generation: A Systematic Investigation. ACM Transactions on Software Engineering and Methodology (2025). doi:10.1145/3722108

[41]

Catherine Tony, Markus Mutas, Nicolás E. Díaz Ferreyra, and Riccardo Scandariato. 2023. LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluations. In Proceedings of the 20th International Conference on Mining Software Repositories (MSR ’23). doi:10.1109/MSR59073.2023.00084

[42]

Weibin Wu, Haoxuan Hu, Zhaoji Fan, Yitong Qiao, Yizhan Huang, Yichen Li, Zibin Zheng, and Michael Lyu. 2025. An Empirical Study of Code Clones from Commercial AI Code Generators. Proceedings of the ACM on Software Engineering 2, FSE (2025), 2874–2896

[43]

Mei Wu-Gehbauer and Christoph Rosenkranz. 2024. Unlocking the Potential of Generative Artificial Intelligence: A Case Study in Software Development. In Proceedings of the International Conference on Information Systems (ICIS 2024) (ICIS 2024 Proceedings, 25). Association for Information Systems

[44]

HanXiang Xu, ShenAo Wang, Ningke Li, Kailong Wang, Yanjie Zhao, Kai Chen, Ting Yu, Yang Liu, and HaoYu Wang. 2024. Large language models for cyber security: A systematic literature review. ACM Transactions on Software Engineering and Methodology (2024)

[45]

Weiwei Xu, Kai Gao, Hao He, and Minghui Zhou. 2025. Licoeval: Evaluating LLMs on License Compliance in Code Generation. In 2025 IEEE/ACM 47th International Conference on Software Engineering (ICSE). IEEE, 1665–1677

[46]

Zhaoxiang Xu, Qingguo Fang, Yanbo Huang, and Mingjian Xie. 2024. The public attitude towards ChatGPT on Reddit: A study based on unsupervised learning from sentiment analysis and topic modeling. PLOS ONE 19, 5 (2024), e0302502

[47]

Zhou Yang, Zhipeng Zhao, Chenyu Wang, Jieke Shi, Dongsun Kim, Donggyun Han, and David Lo. 2024. Unveiling memorization in code models. In Proceedings of the IEEE/ACM 46th International Conference on Software Engineering. 1–13

[48]

Burak Yetistiren, Isik Ozsoy, and Eray Tuzun. 2022. Assessing the quality of GitHub Copilot’s code generation. In Proceedings of the 18th International Conference on Predictive Models and Data Analytics in Software Engineering. 62–71

[49]

Aria Zegers, Natalie Preciado, Jan Duchnowski, Fernanda Madeiral, and Emitzá Guzmán. 2025. Irresponsibility Killed the Cat: Software Accountability Concerns. In 2025 IEEE/ACM 18th International Conference on Cooperative and Human Aspects of Software Engineering (CHASE). IEEE, 131–142
