Security Concerns in Generative AI Coding Assistants: Insights from Online Discussions on GitHub Copilot
Pith reviewed 2026-05-10 17:15 UTC · model grok-4.3
The pith
Analysis of online developer discussions identifies four primary security concerns with generative AI coding assistants.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors collect discussion threads about security issues in GitHub Copilot from Stack Overflow, Reddit, and Hacker News, cluster them with BERTopic, and synthesize the clusters through thematic analysis. Four major areas of concern emerge: potential data leakage, code licensing, adversarial attacks such as prompt injection, and insecure code suggestions. These findings highlight the limitations and trade-offs of applying generative AI to software engineering tasks.
What carries the argument
BERTopic clustering followed by thematic analysis of developer discussion threads on GitHub Copilot security issues.
Load-bearing premise
The sample of discussions from Stack Overflow, Reddit, and Hacker News accurately reflects the security concerns of software developers in general.
What would settle it
A broad survey of professional developers reporting few or no security concerns with generative AI coding assistants would challenge whether the four areas represent widespread views.
Original abstract
Generative Artificial Intelligence (GenAI) has become a central component of many development tools (e.g., GitHub Copilot) that support software practitioners across multiple programming tasks, including code completion, documentation, and bug detection. However, current research has identified significant limitations and open issues in GenAI, including reliability, non-determinism, bias, and copyright infringement. While prior work has primarily focused on assessing the technical performance of these technologies for code generation, less attention has been paid to emerging concerns of software developers, particularly in the security realm.
OBJECTIVE: This work explores security concerns regarding the use of GenAI-based coding assistants by analyzing challenges voiced by developers and software enthusiasts in public online forums.
METHOD: We retrieved posts, comments, and discussion threads addressing security issues in GitHub Copilot from three popular platforms, namely Stack Overflow, Reddit, and Hacker News. These discussions were clustered using BERTopic and then synthesized using thematic analysis to identify distinct categories of security concerns.
RESULTS: Four major concern areas were identified, including potential data leakage, code licensing, adversarial attacks (e.g., prompt injection), and insecure code suggestions, underscoring critical reflections on the limitations and trade-offs of GenAI in software engineering.
IMPLICATIONS: Our findings contribute to a broader understanding of how developers perceive and engage with GenAI-based coding assistants, while highlighting key areas for improving their built-in security features.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper explores security concerns regarding generative AI coding assistants such as GitHub Copilot by retrieving and analyzing public discussions from Stack Overflow, Reddit, and Hacker News. It applies BERTopic clustering to the collected threads followed by thematic analysis to synthesize four major concern categories: potential data leakage, code licensing issues, adversarial attacks (e.g., prompt injection), and insecure code suggestions. The work positions these findings as insights into developer perceptions and trade-offs in GenAI adoption for software engineering tasks.
Significance. If the results hold, the study adds a developer-centered perspective to the literature on GenAI limitations in software engineering, complementing technical evaluations of reliability, bias, and performance. By drawing directly from forum discussions rather than controlled experiments, it highlights practical security worries that could inform tool improvements and future research on human-AI collaboration in coding.
major comments (2)
- [Methods] Methods section: The data collection description provides no specifics on search queries/keywords, time period covered, total volume of posts/comments/threads retrieved, or any inclusion/exclusion criteria applied across the three platforms. These details are load-bearing for evaluating whether the four identified categories are robustly supported or influenced by sampling choices.
- [Methods] Thematic analysis description (following BERTopic clustering): No information is given on the number of analysts involved, inter-rater reliability metrics, or the validation process used to derive and confirm the four concern categories from the clusters. This omission weakens the ability to assess the reliability of the synthesis step central to the results.
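One common inter-rater reliability metric of the kind the second comment asks about is Cohen's kappa, which corrects raw agreement between two coders for agreement expected by chance. The following is a minimal sketch only: the paper does not state which metric, if any, was used, and the function name, the label strings, and the sample codings are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labelled the same items.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty item set")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: from each coder's marginal label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both coders used one identical label throughout; kappa is undefined,
        # so this sketch reports perfect agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up codings of six forum posts into the four concern categories.
analyst_a = ["leak", "leak", "license", "inject", "insecure", "insecure"]
analyst_b = ["leak", "license", "license", "inject", "insecure", "leak"]
kappa = cohen_kappa(analyst_a, analyst_b)
```

With these made-up codings the two analysts agree on four of six posts, and kappa is 5/9, or about 0.56, once chance agreement is removed.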
minor comments (2)
- [Abstract] Abstract: Adding approximate figures for the number of discussions analyzed would help readers gauge the scale of the evidence base supporting the four categories.
- [Results] Results: The presentation of the four categories would benefit from explicit mapping back to representative quotes or cluster examples to strengthen traceability.
Simulated Author's Rebuttal
We thank the referee for the constructive comments and the recommendation for minor revision. We agree that expanding the Methods section with additional details will strengthen the paper's transparency and allow readers to better assess the robustness of our findings.
Point-by-point responses
- Referee: [Methods] Methods section: The data collection description provides no specifics on search queries/keywords, time period covered, total volume of posts/comments/threads retrieved, or any inclusion/exclusion criteria applied across the three platforms. These details are load-bearing for evaluating whether the four identified categories are robustly supported or influenced by sampling choices.
  Authors: We agree that these details are essential for reproducibility and evaluating sampling choices. In the revised manuscript, we will expand the data collection subsection to specify the search queries and keywords used on each platform (e.g., terms combining 'GitHub Copilot' with 'security', 'data leak', 'licensing', 'prompt injection', and 'insecure code'), the time period covered (from Copilot's public release in 2021 through the collection date), the total volume of posts/comments/threads retrieved before and after filtering, and the inclusion/exclusion criteria (e.g., English-language discussions directly addressing security concerns, exclusion of duplicates or unrelated threads). This will clarify how the four categories were derived from the sampled discussions.
  Revision: yes
- Referee: [Methods] Thematic analysis description (following BERTopic clustering): No information is given on the number of analysts involved, inter-rater reliability metrics, or the validation process used to derive and confirm the four concern categories from the clusters. This omission weakens the ability to assess the reliability of the synthesis step central to the results.
  Authors: We acknowledge that the current description of the thematic analysis step is insufficiently detailed. In the revision, we will add a description of the process: the involvement of two authors in reviewing BERTopic clusters (via top terms, representative documents, and manual inspection), the iterative synthesis into the four categories through discussion and consensus-building, and the validation approach (cross-referencing with prior literature on GenAI security and checking for consistency across platforms). If inter-rater reliability was not formally computed, we will note the consensus process used instead. This will improve transparency without altering the reported categories.
  Revision: yes
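The keyword and inclusion/exclusion criteria described in the first response could be applied along the following lines. This is a rough sketch under stated assumptions, not the authors' pipeline: the term lists reuse only the example terms quoted in that response, the function names and sample posts are hypothetical, and the English-language check is omitted because it would need a language-detection step.

```python
import re

# Example terms quoted in the authors' response; the real query set is not given.
PRODUCT_TERMS = ("github copilot",)
SECURITY_TERMS = (
    "security",
    "data leak",
    "licensing",
    "prompt injection",
    "insecure code",
)


def normalize(text):
    """Lower-case the text and collapse whitespace so near-identical posts match."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def is_relevant(text):
    """Inclusion rule (assumed): mentions the product and at least one security term."""
    t = normalize(text)
    return any(p in t for p in PRODUCT_TERMS) and any(s in t for s in SECURITY_TERMS)


def filter_posts(posts):
    """Keep relevant posts and drop duplicates, preserving the original order."""
    seen = set()
    kept = []
    for post in posts:
        key = normalize(post)
        if key in seen or not is_relevant(post):
            continue
        seen.add(key)
        kept.append(post)
    return kept


# Hypothetical sample posts, for illustration only.
sample = [
    "GitHub Copilot prompt injection risk?",
    "github copilot   prompt injection risk?",  # duplicate after normalization
    "How to center a div",                      # unrelated thread
    "GitHub Copilot is fast",                   # product mentioned, no security term
]
kept = filter_posts(sample)
```

Reporting the size of `sample` and `kept` for the real data would supply the before-and-after filtering volumes that the referee requests.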
Circularity Check
No significant circularity
The paper performs an exploratory qualitative study: it retrieves external forum threads from Stack Overflow, Reddit, and Hacker News, applies BERTopic clustering, and conducts thematic analysis to surface four concern categories. No equations, fitted parameters, predictions, or derivations exist; the results are direct outputs of documented processing steps on independent data. No self-citation load-bearing steps or ansatz smuggling appear in the provided material. The derivation chain is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Discussions on public forums like Stack Overflow, Reddit, and Hacker News provide representative insights into developers' security concerns with GenAI coding assistants.