The Role of Online Forums in Developer Understanding of Privacy Law -- A Reddit Case Study
Pith reviewed 2026-06-30 07:23 UTC · model grok-4.3
The pith
Software developers consult Reddit for privacy law advice even after earning certifications, especially on data protection assessments, breach reporting, and cookie consent.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Despite holding privacy-related certifications, most participants frequently use forums to seek legal advice, with key challenges in implementing a data protection impact assessment, reporting a data breach, and obtaining cookie consent; users assess credibility by reviewing respondents' post history, verifying sources, trusting recognized experts, and seeking clarification.
What carries the argument
Survey of self-selected Reddit users combined with qualitative analysis of posts in regulatory-focused subreddits, used to map advice-seeking behavior and credibility evaluation methods.
Load-bearing premise
The 223 survey respondents and 2,248 posts from regulatory subreddits accurately represent how software practitioners seek and apply privacy law guidance in practice.
What would settle it
A follow-up study that directly observes privacy compliance decisions in real development projects and compares error rates between frequent forum users and those who rely only on official sources or training.
Original abstract
Software practitioners use online forums to navigate complex and often ambiguous legal privacy requirements, yet little is known about their professional backgrounds, what challenges they face, and how they use and assess the credibility of the advice received, or how they resolve ambiguities in posts. We report the findings of a survey of 223 Reddit users from regulatory-focused subreddits, complemented by a qualitative analysis of 2,248 posts and responses. Our results show that, despite holding privacy-related certifications, most participants frequently use forums to seek legal advice. Key challenges reported or identified include implementing a data protection impact assessment, reporting a data breach, and obtaining cookie consent. Reddit users often assess credibility by reviewing respondents' post history, verifying sources cited, trusting advice from recognized experts, and following up for clarity before responding. We highlight research and educational directions to bridge gaps in support needed for regulatory compliance guidance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper reports findings from a survey of 223 Reddit users recruited from regulatory-focused subreddits, complemented by qualitative analysis of 2,248 posts and responses. It claims that software practitioners frequently use online forums to seek privacy law advice even when they hold relevant certifications, identifies specific challenges including implementing data protection impact assessments, reporting data breaches, and obtaining cookie consent, describes credibility assessment practices such as reviewing post history and verifying sources, and suggests research and educational directions to address gaps in regulatory compliance support.
Significance. If the reported patterns hold, the work would offer empirical insight into how developers navigate privacy regulations through informal channels, underscoring the limits of certifications alone and pointing to concrete areas where additional guidance is needed. The mixed-methods approach combining survey data with post analysis is a strength for identifying both self-reported behaviors and observable discussion themes.
major comments (2)
- [Methods] Methods section (implied by abstract description of survey and qualitative analysis): The manuscript provides sample sizes (223 survey respondents, 2,248 posts) but omits details on sampling method, response rate, exclusion criteria, recruitment period, and inter-coder reliability for the qualitative coding. These omissions are load-bearing because the central claims about frequency of forum use and identification of specific challenges rest on the validity and representativeness of the collected data.
- [Recruitment and Sampling] Recruitment description (abstract and results): Recruitment targeted regulatory-focused subreddits, which conditions the sample on pre-existing privacy interest and forum engagement. This selection effect directly affects the strongest claim that 'most participants frequently use forums to seek legal advice' and the reported challenges, as the observed patterns may not generalize to software practitioners outside these communities or indicate whether certifications reduce forum reliance in the broader population.
minor comments (1)
- [Abstract] Abstract: The credibility assessment practices are listed but without indication of how many participants reported each method or any quantification of their prevalence.
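One way the prevalence requested in the minor comment could be reported is as a proportion with a confidence interval for each credibility practice. The sketch below uses the Wilson score interval, which is an assumption on our part and not a method the paper is known to use. The respondent total of 223 comes from the abstract; the per-practice counts are hypothetical placeholders, because the abstract gives none.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


TOTAL_RESPONDENTS = 223  # survey size stated in the abstract

# Hypothetical placeholder counts, NOT results from the paper.
# A respondent may report several practices, so counts need not sum to 223.
hypothetical_counts = {
    "reviews respondents' post history": 120,
    "verifies sources cited": 95,
    "trusts recognized experts": 80,
    "follows up for clarity": 60,
}

for practice, count in hypothetical_counts.items():
    low, high = wilson_interval(count, TOTAL_RESPONDENTS)
    share = count / TOTAL_RESPONDENTS
    print(f"{practice}: {count}/{TOTAL_RESPONDENTS} = {share:.1%} "
          f"(95% CI {low:.1%} to {high:.1%})")
```

Reporting an interval alongside each percentage would let readers judge how much weight a sample of this size can carry, independent of the separate question of selection bias.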
Simulated Author's Rebuttal
We thank the referee for their constructive feedback on our manuscript. We respond to each major comment below and will revise the paper to address the identified issues where possible.
Point-by-point responses
- Referee: [Methods] Methods section (implied by abstract description of survey and qualitative analysis): The manuscript provides sample sizes (223 survey respondents, 2,248 posts) but omits details on sampling method, response rate, exclusion criteria, recruitment period, and inter-coder reliability for the qualitative coding. These omissions are load-bearing because the central claims about frequency of forum use and identification of specific challenges rest on the validity and representativeness of the collected data.
  Authors: We agree that the Methods section requires expansion for transparency. In the revised manuscript we will add: the sampling approach (posts in regulatory-focused subreddits with a link to an anonymous survey), the recruitment period, explicit exclusion criteria (e.g., incomplete or duplicate responses), and a description of the qualitative coding process. For inter-coder reliability we will report the procedure used and any agreement metrics calculated. A precise response rate cannot be computed because the survey was posted publicly without individual invitation tracking; we will state this limitation explicitly and note that the sample is best understood as a convenience sample of engaged subreddit users.
  Revision: yes
- Referee: [Recruitment and Sampling] Recruitment description (abstract and results): Recruitment targeted regulatory-focused subreddits, which conditions the sample on pre-existing privacy interest and forum engagement. This selection effect directly affects the strongest claim that 'most participants frequently use forums to seek legal advice' and the reported challenges, as the observed patterns may not generalize to software practitioners outside these communities or indicate whether certifications reduce forum reliance in the broader population.
  Authors: We acknowledge the selection effect and agree it constrains generalizability. The study was designed as a case study of forum users who already engage with privacy topics; the central finding is that even within this group, certifications do not eliminate reliance on forums. In revision we will (1) qualify all frequency and challenge claims to refer to this population, (2) add an explicit Limitations subsection discussing the selection bias and its implications for inferences about the broader practitioner population, and (3) temper language suggesting the results speak to whether certifications reduce forum use overall. We will not claim broader representativeness.
  Revision: partial
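The rebuttal promises inter-coder agreement metrics without naming one. A common choice for two coders assigning one categorical code per post is Cohen's kappa; the sketch below assumes that setup, which the source does not confirm. The code labels and the six example posts are hypothetical and are not data from the paper.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who each assigned one code to the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)

    # Observed agreement: share of items given the same code by both coders.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each code, multiply the two coders' marginal
    # frequencies, then sum over all codes either coder used.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    codes = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in codes) / (n * n)

    if expected == 1.0:
        # Both coders used one identical code for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes for six posts (illustration only).
coder_1 = ["dpia", "breach", "cookie", "dpia", "cookie", "breach"]
coder_2 = ["dpia", "breach", "cookie", "cookie", "cookie", "breach"]

kappa = cohens_kappa(coder_1, coder_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

If more than two coders were involved, or posts could receive several codes at once, a different statistic (for example Krippendorff's alpha) would be the better fit; which applies depends on coding details the manuscript has yet to report.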
Circularity Check
Empirical survey with no derivations, fits, or self-referential steps
full rationale
The paper is a standard empirical survey and qualitative analysis of 223 Reddit respondents and 2,248 posts. No equations, parameters, derivations, or predictions appear in the abstract or described content. Claims rest directly on reported survey responses and post coding rather than any reduction to fitted inputs or self-citation chains. The work is self-contained against its own data collection and does not invoke uniqueness theorems or ansatzes.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Self-reported survey responses from Reddit users accurately capture their professional backgrounds, behaviors, and challenges regarding privacy law compliance.