Challenges in Android Data Disclosure: An Empirical Study

Eric Bodden; Michael Schlichtig; Mohamed Soliman; Mugdha Khedkar

arxiv: 2601.20459 · v2 · submitted 2026-01-28 · 💻 cs.SE

Mugdha Khedkar, Michael Schlichtig, Mohamed Soliman, Eric Bodden

Pith reviewed 2026-05-16 10:56 UTC · model grok-4.3

classification 💻 cs.SE

keywords Android developers · Data Safety Section · privacy disclosure · empirical study · Google Play Store · data collection · developer challenges · compliance reporting

The pith

Android developers confidently recognize data their apps collect but struggle to turn that into accurate, compliant disclosures on Google's Data Safety Section form.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The study surveyed 41 Android developers and examined 172 online forum threads involving 642 others to map how they handle privacy data reporting. Developers typically classify data into Google's categories by hand or skip the step altogether, turning instead to online resources for help. While they report strong confidence in spotting the data their code actually gathers, they show low confidence in mapping it correctly to the required form fields. The main sticking points are pinpointing which data counts as privacy-relevant, unclear instructions on the form itself, and fear that mismatches will trigger app rejection by Google. These patterns point to a gap between knowing what data exists and producing disclosures that satisfy store policies.

Core claim

Through survey responses from 41 developers and analysis of discussions by 642 more, the paper establishes that developers can identify privacy-related data in their apps yet encounter repeated obstacles when translating that knowledge into Data Safety Section disclosures, including manual categorization difficulties, incomplete form comprehension, and worries over rejection for non-compliance with Google's rules.

What carries the argument

An empirical combination of a targeted developer survey and large-scale forum thread analysis that surfaces patterns in how developers map app data to the fixed DSS categories.

If this is right

Clearer official guidance on data categorization would reduce reliance on ad-hoc manual work and external forums.
Tooling that links code scanning directly to DSS categories could lower the chance of incomplete or mismatched disclosures.
Reduced rejection risk from disclosure errors would encourage more developers to report data accurately rather than omit entries.
Widespread adoption of better support materials would shrink the gap between data recognition and compliant reporting.
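
To make the tooling implication concrete, here is a minimal sketch, not a tool from the paper, of how requested Android permissions might be turned into candidate DSS entries for manual review. The lookup table `PERMISSION_TO_DSS` and the function `suggest_dss_entries` are hypothetical names, and the permission-to-category pairs are illustrative assumptions rather than Google's official mapping. Requesting a permission does not by itself mean the app collects the data, so the output is only a starting list.

```python
# Illustrative only: the mapping below is an assumption made for this sketch,
# not Google's official guidance and not a tool described in the paper.
from collections import defaultdict

# Hypothetical lookup: Android permission -> (DSS data category, DSS data type)
PERMISSION_TO_DSS = {
    "android.permission.ACCESS_FINE_LOCATION": ("Location", "Precise location"),
    "android.permission.ACCESS_COARSE_LOCATION": ("Location", "Approximate location"),
    "android.permission.READ_CONTACTS": ("Contacts", "Contacts"),
    "android.permission.READ_CALENDAR": ("Calendar", "Calendar events"),
}


def suggest_dss_entries(permissions):
    """Group requested permissions into candidate DSS entries for manual review.

    Returns (candidates, unmapped): candidates maps each DSS category to a
    sorted list of data types; unmapped lists permissions the table lacks.
    """
    candidates = defaultdict(set)
    unmapped = []
    for permission in permissions:
        entry = PERMISSION_TO_DSS.get(permission)
        if entry is None:
            # Unknown permissions are surfaced, not silently dropped.
            unmapped.append(permission)
        else:
            category, data_type = entry
            candidates[category].add(data_type)
    return {c: sorted(t) for c, t in candidates.items()}, unmapped


requested = [
    "android.permission.ACCESS_FINE_LOCATION",
    "android.permission.ACCESS_COARSE_LOCATION",
    "android.permission.INTERNET",
]
candidates, unmapped = suggest_dss_entries(requested)
print(candidates)  # candidate DSS entries to review by hand
print(unmapped)    # permissions the sketch cannot classify
```

Returning the unmapped permissions separately keeps the gaps visible, which matters because the study reports that developers sometimes omit classification entirely.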

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar disclosure challenges likely appear in other app-store privacy sections that use fixed category lists.
Integrating automated data-flow analysis into developer IDEs could test whether the reported confidence gap shrinks in practice.
Training resources focused on form navigation rather than data detection might address the specific confidence drop the study identifies.

Load-bearing premise

The 41 survey participants and 172 forum threads speak for the wider group of Android developers who must fill out the Data Safety Section.

What would settle it

A larger random sample of active Android developers reporting high confidence and low effort when completing the DSS form, with none of the listed difficulties, would contradict the central finding.

Figures

Figures reproduced from arXiv: 2601.20459 by Eric Bodden, Michael Schlichtig, Mohamed Soliman, Mugdha Khedkar.

**Figure 1.** Data Safety Section of Signal [11]. (1) Data sharing. (2) Data collection. (3) Security practices (encryption and data deletion).

**Figure 2.** Data collected part of the DSS of Signal [11]. (1) Data category. (2) Data type. (3) Purpose.

**Figure 3.** Overview of the study design. Our approach combines two complementary data sources: (1) a survey of Android developers capturing their self-reported practices and perceptions, and (2) a qualitative analysis of online developer discussions that reveal naturally occurring challenges in community settings. The survey directly addresses RQ1 and RQ2, while both data sources collectively answer RQ3 …

**Figure 4.** Methods and resources participants use to categorize privacy-related data collected by their app into the DSS data …

**Figure 5.** Survey participants’ responses to the questions about confidence levels for different processes (RQ2).

**Figure 6.** Bar chart showing the distribution of challenges …

Original abstract

Current legal frameworks enforce that Android developers accurately report the data their apps collect. However, large codebases can make this reporting challenging. This paper employs an empirical approach to understand developers' experience with Google Play Store's Data Safety Section (DSS) form. We first survey 41 Android developers to understand how they categorize privacy-related data into DSS categories and how confident they feel when completing the DSS form. To gain a broader and more detailed view of the challenges developers encounter during the process, we complement the survey with an analysis of 172 online developer discussions, capturing the perspectives of 642 additional developers. Together, these two data sources represent insights from 683 developers. Our findings reveal that developers often manually classify the privacy-related data their apps collect into the data categories defined by Google (or, in some cases, omit classification entirely) and rely heavily on existing online resources when completing the form. Moreover, developers are generally confident in recognizing the data their apps collect, yet they lack confidence in translating this knowledge into DSS-compliant disclosures. Key challenges include issues in identifying privacy-relevant data to complete the form, limited understanding of the form, and concerns about app rejection due to discrepancies with Google's privacy requirements. These results underscore the need for clearer guidance and more accessible tooling to support developers in meeting privacy-aware reporting obligations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper supplies fresh survey and forum data on Android developers' struggles with the Data Safety Section but rests on a small convenience sample whose representativeness is unclear.

The paper's core finding is that Android developers feel confident spotting the data their apps collect but struggle to map it correctly onto Google's Data Safety Section categories. This comes from a new survey of 41 developers plus analysis of 172 forum threads involving 642 more. The work is new in its focus on the DSS form specifically.

It combines direct survey questions on categorization confidence with broader discussion threads to identify issues like manual classification, limited form understanding, and worries about rejection. The total of 683 developer perspectives is a reasonable starting point for this kind of study. It does a solid job documenting the practical hurdles without overclaiming. The authors note reliance on online resources and occasional omissions, which matches what one might expect from large codebases.

The main limitation is the sample. The survey size is small, and both the survey and forum data likely suffer from self-selection: developers who respond or post publicly may not represent the average. No recruitment details or response rates are mentioned, which weakens how far the "generally confident yet translation-challenged" result can be pushed. The forum analysis could also be skewed by search terms or date ranges not specified.

This is for people studying mobile privacy tools or compliance processes. It gives concrete challenges that could inform better guidance or automation. A reader interested in empirical software engineering on privacy would get value from the reported experiences. I would take it to a reading group to talk about survey design in developer studies. I probably would not cite it yet because of the sample issues. It deserves peer review to see if the authors can address the generalizability concerns with more data or clearer methods.

Referee Report

3 major / 2 minor

Summary. The paper reports an empirical study of Android developers' experiences completing Google's Data Safety Section (DSS) form. It combines a survey of 41 developers with thematic analysis of 172 online forum threads (involving 642 developers) to examine how developers categorize privacy-related data, their confidence levels, and the challenges they encounter. The central claims are that developers are generally confident in recognizing the data their apps collect but lack confidence in mapping this knowledge to DSS categories; they frequently perform manual classification or omit data and rely on online resources, with key difficulties including identifying privacy-relevant data, understanding the form, and concerns over app rejection due to mismatches with Google's requirements.

Significance. If the reported patterns hold for the broader population, the study supplies useful observational evidence on the practical friction points in Android privacy disclosure. The mixed-methods design (survey plus forum corpus) provides convergent qualitative support for the stated challenges and could usefully inform the design of clearer documentation or automated tooling. The work is grounded in real developer artifacts rather than purely theoretical analysis.

major comments (3)

[Survey methodology] Survey methodology section: no information is supplied on recruitment channels, response rate, or participant demographics for the n=41 sample. Because the headline claim about developers being 'generally confident' in data recognition rests on generalizing self-reported attitudes from this convenience sample, the absence of these details leaves self-selection bias unaddressed and weakens the external validity of the confidence and challenge findings.
[Forum analysis] Forum analysis section: selection criteria for the 172 threads (search terms, date range, exclusion rules) are not stated. Without them, it is impossible to evaluate whether the 642 developers represented in the corpus are systematically biased toward privacy-aware or frustrated individuals, directly affecting the reliability of the broader view offered as complementary evidence.
[Results] Results on confidence (survey and forum findings): the paper reports self-reported confidence levels but provides no quantitative validation (e.g., comparison against actual disclosure accuracy or external audit). This limits how strongly the distinction between 'confident recognition' and 'low translation confidence' can be asserted as a general phenomenon.

minor comments (2)

[Abstract] Abstract: the total of 683 developers is presented as the combined insight base, but any possible overlap between survey respondents and forum posters is not addressed; a brief clarification would improve transparency.
[Discussion or Limitations] The paper would benefit from an explicit limitations subsection that directly discusses sample representativeness and the implications for generalizing the 'generally confident yet translation-challenged' characterization.
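
The overlap point in the Abstract comment can be bounded with simple arithmetic. A minimal sketch, assuming each source's count (41 and 642) already refers to distinct developers; the function name `distinct_developer_bounds` is hypothetical:

```python
def distinct_developer_bounds(survey_n, forum_n):
    """Return (lower, upper) bounds on distinct developers across both sources."""
    # Upper bound: no survey respondent also posted in the analyzed threads.
    upper = survey_n + forum_n
    # Lower bound: every member of the smaller group also appears in the larger.
    lower = max(survey_n, forum_n)
    return lower, upper


lower, upper = distinct_developer_bounds(41, 642)
print(lower, upper)  # 642 683
```

The combined base therefore lies between 642 and 683 distinct developers; the reported 683 holds only if no survey respondent also posted in the analyzed threads.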

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and constructive feedback on our manuscript. We address each of the major comments below and indicate the revisions we will make to improve the paper.

Referee: [Survey methodology] Survey methodology section: no information is supplied on recruitment channels, response rate, or participant demographics for the n=41 sample. Because the headline claim about developers being 'generally confident' in data recognition rests on generalizing self-reported attitudes from this convenience sample, the absence of these details leaves self-selection bias unaddressed and weakens the external validity of the confidence and challenge findings.

Authors: We agree with this assessment. The current manuscript does not provide these details, which is an oversight. In the revised version, we will add a new subsection under Methodology detailing the recruitment channels (posts on Reddit's r/androiddev and XDA Developers), the number of invitations sent and response rate, and participant demographics including experience levels and app categories. We will also explicitly discuss the limitations of the convenience sample and potential self-selection bias in the Discussion section, adjusting our claims about generalizability accordingly.

revision: yes

Referee: [Forum analysis] Forum analysis section: selection criteria for the 172 threads (search terms, date range, exclusion rules) are not stated. Without them, it is impossible to evaluate whether the 642 developers represented in the corpus are systematically biased toward privacy-aware or frustrated individuals, directly affecting the reliability of the broader view offered as complementary evidence.

Authors: We acknowledge that the selection criteria were not explicitly stated. For the revision, we will expand the 'Forum Analysis' section to specify the search terms (such as 'Data Safety Section', 'DSS form', 'Google Play data safety'), the time period covered (2021-2023), the sources (Reddit, XDA Developers, Stack Overflow), and the exclusion criteria (e.g., threads not in English or unrelated to data disclosure). This will help readers assess any potential biases in the forum data.

revision: yes

Referee: [Results] Results on confidence (survey and forum findings): the paper reports self-reported confidence levels but provides no quantitative validation (e.g., comparison against actual disclosure accuracy or external audit). This limits how strongly the distinction between 'confident recognition' and 'low translation confidence' can be asserted as a general phenomenon.

Authors: The referee is correct that our findings on confidence are based on self-reported data without quantitative validation against actual disclosure accuracy. As an exploratory study using mixed methods, we did not conduct external audits. In the revision, we will add a dedicated Limitations subsection clarifying that the reported distinction between recognition confidence and translation confidence is based on self-reports and may not correspond to objective measures. We will frame the claims more cautiously and propose future research directions for validation studies. No such validation data exists in our current dataset.

revision: partial

Circularity Check

0 steps flagged

No circularity: purely observational empirical study with no derivations or fitted predictions

The paper reports results from a survey of 41 Android developers and thematic analysis of 172 forum threads involving 642 additional developers. No equations, mathematical models, parameter fitting, or first-principles derivations appear anywhere in the manuscript. Claims about developer confidence and challenges are presented as direct summaries of self-reported survey responses and forum content, without any step that reduces a 'prediction' or result back to the input data by construction. Sample representativeness is a validity threat but does not constitute circularity under the defined patterns. The study is self-contained as an empirical report.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a purely empirical study; it introduces no mathematical free parameters, formal axioms, or postulated entities.

Reference graph

Works this paper leans on

61 extracted references · 61 canonical work pages

Challenges in Android Data Disclosure: An Empirical Study. MOBILESoft ’26, April 12, 2026, Rio de Janeiro, Brazil.

[1] 2018. The European parliament and the council of the European union. General Data Protection Regulation (GDPR). Retrieved Oct 13, 2025 from https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32016R0679
[2] 2018. GDPR Article 13. Retrieved Oct 13, 2025 from https://gdpr-info.eu/art-13-gdpr/
[3] 2018. GDPR Article 4. Retrieved Oct 13, 2025 from https://gdpr-info.eu/art-4-gdpr/
[4] 2018. GDPR penalties. Retrieved Oct 13, 2025 from https://gdpr-info.eu/issues/fines-penalties/
[5] 2022. AdMob. Retrieved Oct 13, 2025 from https://admob.google.com/home/
[6] 2022. AppLovin. Retrieved Oct 13, 2025 from https://www.applovin.com/
[7] 2022. Data Safety Section. Retrieved Oct 13, 2025 from https://blog.google/products/google-play/data-safety/
[8] 2022. Get more information about your apps in Google Play. Retrieved Oct 20, 2025 from https://blog.google/products/google-play/data-safety/
[9] 2022. Google’s data types for DSS. Retrieved Oct 13, 2025 from https://support.google.com/googleplay/android-developer/answer/10787469#zippy=%2Cdata-types
[10] 2022. Privado.ai. Retrieved Oct 13, 2025 from https://www.privado.ai/data-safety-report
[11] 2022. Signal Private Messenger. Retrieved Oct 13, 2025 from https://play.google.com/store/apps/datasafety?id=org.thoughtcrime.securesms&hl=en
[12] 2022. Unity. Retrieved Oct 13, 2025 from https://unity.com/
[13] 2023. See No Evil: Loopholes in Google’s Data Safety Labels Keep Companies in the Clear and Consumers in the Dark. Retrieved Oct 13, 2025 from https://foundation.mozilla.org/en/campaigns/googles-data-safety-labels/
[14] 2024. Google Play’s Security: 2.36 Million Apps Blocked For Violations In 2024. Retrieved Oct 13, 2025 from https://www.forbes.com/sites/alexvakulov/2025/02/02/google-plays-security-236m-apps-blocked-for-violations-in-2024/
[15] 2025. App Store. Retrieved Oct 13, 2025 from https://www.apple.com/app-store/
[16] 2025. Code Project. Retrieved Oct 13, 2025 from https://www.codeproject.com/
[17] 2025. DZone. Retrieved Oct 13, 2025 from https://dzone.com/
[18] 2025. Firebase Analytics. Retrieved Oct 13, 2025 from https://firebase.google.com/docs/analytics/get-started?platform=web
[19] 2025. GitHub API. Retrieved Oct 13, 2025 from https://docs.github.com/en/rest?apiVersion=2022-11-28
[20] 2025. Google Play Console. Retrieved Oct 13, 2025 from https://play.google.com/console/signup
[21] 2025. Google Play Store. Retrieved Oct 13, 2025 from https://play.google.com/store/
[22] 2025. Hacker news. Retrieved Oct 13, 2025 from https://news.ycombinator.com/
[23] 2025. New Android app releases per month. Retrieved Oct 13, 2025 from https://www.appbrain.com/stats/number-of-android-apps
[24] 2025. Notifee. Retrieved Oct 13, 2025 from https://notifee.app/
[25] 2025. Paper artifacts. Retrieved Oct 22, 2025 from https://doi.org/10.5281/zenodo.17416187
[26] 2025. Reddit API. Retrieved Oct 13, 2025 from https://www.reddit.com/dev/api/
[27] 2025. Site point. Retrieved Oct 13, 2025 from https://www.sitepoint.com/
[28] 2025. Stack Exchange API. Retrieved Oct 13, 2025 from https://api.stackexchange.com/
[29] 2025. XDA forums. Retrieved Oct 13, 2025 from https://xdaforums.com/
[30–31] Alessia Antelmi, Gennaro Cordasco, Daniele De Vinco, and Carmine Spagnuolo. 2023. The Age of Snippet Programming: Toward Understanding Developer Communities in Stack Overflow and Reddit. In Companion Proceedings of the ACM Web Conference 2023 (Austin, TX, USA) (WWW ’23 Companion). Association for Computing Machinery, New York, NY, USA, 1218–1224. https://doi.org/10.1145/3543873.3587673
[32] Nia Castelly and Fergus Hurley. 2022. Introducing Checks: simplifying privacy for app developers - Google: The Keyword. Retrieved April 10, 2025 from https://blog.google/technology/area-120/checks/
[33] Victoria Clarke and Virginia Braun. 2014. Thematic Analysis. Springer New York, New York, NY, 1947–1952. https://doi.org/10.1007/978-1-4614-5583-7_311
[34] Juliet Corbin and Anselm Strauss. 1990. Grounded Theory Research: Procedures, Canons and Evaluative Criteria. Zeitschrift für Soziologie 19, 6 (1990), 418–427. https://doi.org/10.1515/zfsoz-1990-0602
[35] Lucas Franke, Huayu Liang, Sahar Farzanehpour, Aaron Brantly, James C. Davis, and Chris Brown. 2024. An Exploratory Mixed-methods Study on General Data Protection Regulation (GDPR) Compliance in Open-Source Software. In Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (Barcelona, Spain) (ESEM ’24). A...
[36] Aniketh Girish, Joel Reardon, Juan Tapiador, Srdjan Matic, and Narseo Vallina-Rodriguez. 2025. Your Signal, Their Data: An Empirical Privacy Analysis of Wireless-scanning SDKs in Android. arXiv:2503.15238 [cs.CR] https://arxiv.org/abs/2503.15238
[37] Sandra Höltervennhoff, Noah Wöhler, Arne Möhle, Marten Oltrogge, Yasemin Acar, Oliver Wiese, and Sascha Fahl. 2024. A Mixed-Methods Study on User Experiences and Challenges of Recovery Codes for an End-to-End Encrypted Service. In 33rd USENIX Security Symposium (USENIX Security 24). USENIX Association, Philadelphia, PA, 7267–7284. https://www.usenix.org/...
[38] Hiroki Inayoshi, Shohei Kakei, and Shoichi Saito. 2024. Detection of Inconsistencies between Guidance Pages and Actual Data Collection of Third-party SDKs in Android Apps. In Proceedings of the IEEE/ACM 11th International Conference on Mobile Software Engineering and Systems (Lisbon, Portugal) (MOBILESoft ’24). Association for Computing Machinery, New ...
[39] Tahira Iqbal, Moniba Khan, Kuldar Taveter, and Norbert Seyff. 2021. Mining Reddit as a New Source for Software Requirements. In 2021 IEEE 29th International Requirements Engineering Conference (RE). 128–138. https://doi.org/10.1109/RE51729.2021.00019
[40–41] Patrick Gage Kelley, Joanna Bresee, Lorrie Faith Cranor, and Robert W. Reeder. 2009. A "nutrition label" for privacy. In Proceedings of the 5th Symposium on Usable Privacy and Security (Mountain View, California, USA) (SOUPS ’09). Association for Computing Machinery, New York, NY, USA, Article 4, 12 pages. https://doi.org/10.1145/1572532.1572538
[42] Patrick Gage Kelley, Lorrie Faith Cranor, and Norman Sadeh. 2013. Privacy as part of the app decision-making process. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Paris, France) (CHI ’13). Association for Computing Machinery, New York, NY, USA, 3393–3402. https://doi.org/10.1145/2470654.2466466
[43] Rishabh Khandelwal, Asmit Nayak, Paul Chung, and Kassem Fawaz. 2023. Unpacking Privacy Labels: A Measurement and Developer Perspective on Google’s Data Safety Section. arXiv:2306.08111 [cs.CY]
[44] Mugdha Khedkar, Ambuj Kumar Mondal, and Eric Bodden. 2026. A Study of Privacy-Related Data Collected by Android Apps. Automated Software Engineering 33, 2 (2026), 45. https://doi.org/10.1007/s10515-025-00589-3
[45] Mugdha Khedkar, Ambuj Kumar Mondal, and Eric Bodden. 2024. Do Android App Developers Accurately Report Collection of Privacy-Related Data? In Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering Workshops (Sacramento, CA, USA) (ASEW ’24). Association for Computing Machinery, New York, NY, USA, 176–186. https://doi.or...
[46] Ruiyin Li, Peng Liang, Mohamed Soliman, and Paris Avgeriou. 2021. Understanding Architecture Erosion: The Practitioners’ Perceptive. In 2021 IEEE/ACM 29th International Conference on Program Comprehension (ICPC). 311–322. https://doi.org/10.1109/ICPC52881.2021.00037
[47] Tianshi Li, Lorrie Faith Cranor, Yuvraj Agarwal, and Jason I. Hong. 2024. Matcha: An IDE Plugin for Creating Accurate Privacy Nutrition Labels. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8, 1 (March 2024), 1–38. https://doi.org/10.1145/3643544
[48–49] Tianshi Li, Kayla Reiman, Yuvraj Agarwal, Lorrie Faith Cranor, and Jason I. Hong. 2022. Understanding Challenges for Developers to Create Accurate Privacy Nutrition Labels. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 588, 24 pages. https://doi.org/10.1145/3491102.3502012
[50] Anthony Peruma, Timothy Huo, Ana Catarina Araújo, Jake Imanaka, and Rick Kazman. 2024. A Developer-Centric Study Exploring Mobile Application Security Practices and Challenges. arXiv:2408.09032 [cs.CR] https://arxiv.org/abs/2408.09032
[51] David Rodriguez, Akshath Jain, Jose M. Del Alamo, and Norman Sadeh. 2023. Comparing Privacy Label Disclosures of Apps Published in both the App Store and Google Play Stores. In 2023 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW). 150–157. https://doi.org/10.1109/EuroSPW59978.2023.00022
[52] Per Runeson and Martin Höst. 2009. Guidelines for conducting and reporting case study research in software engineering. Empirical Software Engineering 14 (2009), 131–164. https://api.semanticscholar.org/CorpusID:207144526
[53] Yusei Sakuraba, Hiroki Inayoshi, Shoichi Saito, and Akito Monden. 2025. Plaintext in the Wild: Investigating Secure Connection Label Accuracy for Android Apps. In 2025 IEEE International Conference on Source Code Analysis & Manipulation (SCAM). 145–156. https://doi.org/10.1109/SCAM67354.2025.00022
[54] Sk Golam Saroar and Maleknaz Nayebi. 2023. Developers’ Perception of GitHub Actions: A Survey Analysis. arXiv:2303.04084 [cs.SE] https://arxiv.org/abs/2303.04084
[55] Grishma Shrestha, Shristi Shrestha, and Anas Mahmoud. 2025. No Country for Indie Developers: A Study of Google Play’s Closed Testing Requirements for New Personal Developer Accounts. ACM Trans. Softw. Eng. Methodol. (May 2025). https://doi.org/10.1145/3736578 Just Accepted.
[56] Rocky Slavin, Xiaoyin Wang, Mitra Bokaei Hosseini, James Hester, Ram Krishnan, Jaspreet Bhatia, Travis D. Breaux, and Jianwei Niu. 2016. Toward a framework for detecting privacy policy violations in android application code. In Proceedings of the 38th International Conference on Software Engineering (Austin, Texas) (ICSE ’16). Association for Computing Machi...
[57] Mohammad Tahaei, Kami Vaniea, and Naomi Saphra. 2020. Understanding Privacy-Related Questions on Stack Overflow. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–14. https://doi.org/10.1145/3313831.3376768
[58] Zeya Tan and Wei Song. 2023. PTPDroid: Detecting Violated User Privacy Disclosures to Third-Parties of Android Apps. In 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE). 473–485. https://doi.org/10.1109/ICSE48619.2023.00050
[59] Xiaoyin Wang, Xue Qin, Mitra Bokaei Hosseini, Rocky Slavin, Travis D. Breaux, and Jianwei Niu. 2018. GUILeak: Tracing Privacy Policy Claims on User Input Data for Android Applications. In Proceedings of the 40th International Conference on Software Engineering (Gothenburg, Sweden) (ICSE ’18). Association for Computing Machinery, New York, NY, USA, 37–47. h...
[60] Le Yu, Xiapu Luo, Xule Liu, and Tao Zhang. 2016. Can We Trust the Privacy Policies of Android Apps? In 2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). 538–549. https://doi.org/10.1109/DSN.2016.55
[61] Sebastian Zimmeck, Ziqi Wang, Lieyong Zou, Roger Iyengar, Bin Liu, Florian Schaub, Shomir Wilson, Norman Sadeh, Steven M. Bellovin, and Joel Reidenberg. 2017. Automated Analysis of Privacy Requirements for Mobile Apps. Korea Society of Internet Information, Korea, Republic of. https://doi.org/10.14722/ndss.2017.23034

A Codebook. In this section, we present...

[1] 2018. The European parliament and the council of the European union. General Data Protection Regulation (GDPR). Retrieved Oct 13, 2025 from https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32016R0679

[2] 2018. GDPR Article 13. Retrieved Oct 13, 2025 from https://gdpr-info.eu/art-13-gdpr/

[3] 2018. GDPR Article 4. Retrieved Oct 13, 2025 from https://gdpr-info.eu/art-4-gdpr/

[4] 2018. GDPR penalties. Retrieved Oct 13, 2025 from https://gdpr-info.eu/issues/fines-penalties/

[5] 2022. AdMob. Retrieved Oct 13, 2025 from https://admob.google.com/home/

[6] 2022. AppLovin. Retrieved Oct 13, 2025 from https://www.applovin.com/

[7] 2022. Data Safety Section. Retrieved Oct 13, 2025 from https://blog.google/products/google-play/data-safety/

[8] 2022. Get more information about your apps in Google Play. Retrieved Oct 20, 2025 from https://blog.google/products/google-play/data-safety/

Challenges in Android Data Disclosure: An Empirical Study. MOBILESoft ’26, April 12, 2026, Rio de Janeiro, Brazil

[9] 2022. Google’s data types for DSS. Retrieved Oct 13, 2025 from https://support.google.com/googleplay/android-developer/answer/10787469#zippy=%2Cdata-types

[10] 2022. Privado.ai. Retrieved Oct 13, 2025 from https://www.privado.ai/data-safety-report

[11] 2022. Signal Private Messenger. Retrieved Oct 13, 2025 from https://play.google.com/store/apps/datasafety?id=org.thoughtcrime.securesms&hl=en

[12] 2022. Unity. Retrieved Oct 13, 2025 from https://unity.com/

[13] 2023. See No Evil: Loopholes in Google’s Data Safety Labels Keep Companies in the Clear and Consumers in the Dark. Retrieved Oct 13, 2025 from https://foundation.mozilla.org/en/campaigns/googles-data-safety-labels/

[14] 2024. Google Play’s Security: 2.36 Million Apps Blocked For Violations In 2024. Retrieved Oct 13, 2025 from https://www.forbes.com/sites/alexvakulov/2025/02/02/google-plays-security-236m-apps-blocked-for-violations-in-2024/

[15] 2025. App Store. Retrieved Oct 13, 2025 from https://www.apple.com/app-store/

[16] 2025. Code Project. Retrieved Oct 13, 2025 from https://www.codeproject.com/

[17] 2025. DZone. Retrieved Oct 13, 2025 from https://dzone.com/

[18] 2025. Firebase Analytics. Retrieved Oct 13, 2025 from https://firebase.google.com/docs/analytics/get-started?platform=web

[19] 2025. GitHub API. Retrieved Oct 13, 2025 from https://docs.github.com/en/rest?apiVersion=2022-11-28

[20] 2025. Google Play Console. Retrieved Oct 13, 2025 from https://play.google.com/console/signup

[21] 2025. Google Play Store. Retrieved Oct 13, 2025 from https://play.google.com/store/

[22] 2025. Hacker news. Retrieved Oct 13, 2025 from https://news.ycombinator.com/

[23] 2025. New Android app releases per month. Retrieved Oct 13, 2025 from https://www.appbrain.com/stats/number-of-android-apps

[24] 2025. Notifee. Retrieved Oct 13, 2025 from https://notifee.app/

[25] 2025. Paper artifacts. Retrieved Oct 22, 2025 from https://doi.org/10.5281/zenodo.17416187

[26] 2025. Reddit API. Retrieved Oct 13, 2025 from https://www.reddit.com/dev/api/

[27] 2025. Site point. Retrieved Oct 13, 2025 from https://www.sitepoint.com/

[28] 2025. Stack Exchange API. Retrieved Oct 13, 2025 from https://api.stackexchange.com/

[29] 2025. XDA forums. Retrieved Oct 13, 2025 from https://xdaforums.com/

[30] Alessia Antelmi, Gennaro Cordasco, Daniele De Vinco, and Carmine Spagnuolo. 2023. The Age of Snippet Programming: Toward Understanding Developer Communities in Stack Overflow and Reddit. In Companion Proceedings of the ACM Web Conference 2023 (Austin, TX, USA) (WWW ’23 Companion). Association for Computing Machinery, New York, NY, USA, 1218–1224. https://doi.org/10.1145/3543873.3587673

[32] Nia Castelly and Fergus Hurley. 2022. Introducing Checks: simplifying privacy for app developers - Google: The Keyword. Retrieved April 10, 2025 from https://blog.google/technology/area-120/checks/

[33] Victoria Clarke and Virginia Braun. 2014. Thematic Analysis. Springer New York, New York, NY, 1947–1952. https://doi.org/10.1007/978-1-4614-5583-7_311

[34] Juliet Corbin and Anselm Strauss. 1990. Grounded Theory Research: Procedures, Canons and Evaluative Criteria. Zeitschrift für Soziologie 19, 6 (1990), 418–427. https://doi.org/doi:10.1515/zfsoz-1990-0602

[35] Lucas Franke, Huayu Liang, Sahar Farzanehpour, Aaron Brantly, James C. Davis, and Chris Brown. 2024. An Exploratory Mixed-methods Study on General Data Protection Regulation (GDPR) Compliance in Open-Source Software. In Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (Barcelona, Spain) (ESEM ’24). A...

[36] Aniketh Girish, Joel Reardon, Juan Tapiador, Srdjan Matic, and Narseo Vallina-Rodriguez. 2025. Your Signal, Their Data: An Empirical Privacy Analysis of Wireless-scanning SDKs in Android. arXiv:2503.15238 [cs.CR] https://arxiv.org/abs/2503.15238

[37] Sandra Höltervennhoff, Noah Wöhler, Arne Möhle, Marten Oltrogge, Yasemin Acar, Oliver Wiese, and Sascha Fahl. 2024. A Mixed-Methods Study on User Experiences and Challenges of Recovery Codes for an End-to-End Encrypted Service. In 33rd USENIX Security Symposium (USENIX Security 24). USENIX Association, Philadelphia, PA, 7267–7284. https://www.usenix.org/...

[38] Hiroki Inayoshi, Shohei Kakei, and Shoichi Saito. 2024. Detection of Inconsistencies between Guidance Pages and Actual Data Collection of Third-party SDKs in Android Apps. In Proceedings of the IEEE/ACM 11th International Conference on Mobile Software Engineering and Systems (Lisbon, Portugal) (MOBILESoft ’24). Association for Computing Machinery, New ...

[39] Tahira Iqbal, Moniba Khan, Kuldar Taveter, and Norbert Seyff. 2021. Mining Reddit as a New Source for Software Requirements. In 2021 IEEE 29th International Requirements Engineering Conference (RE). 128–138. https://doi.org/10.1109/RE51729.2021.00019

[40] Patrick Gage Kelley, Joanna Bresee, Lorrie Faith Cranor, and Robert W. Reeder. 2009. A "nutrition label" for privacy. In Proceedings of the 5th Symposium on Usable Privacy and Security (Mountain View, California, USA) (SOUPS ’09). Association for Computing Machinery, New York, NY, USA, Article 4, 12 pages. https://doi.org/10.1145/1572532.1572538

[42] Patrick Gage Kelley, Lorrie Faith Cranor, and Norman Sadeh. 2013. Privacy as part of the app decision-making process. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Paris, France) (CHI ’13). Association for Computing Machinery, New York, NY, USA, 3393–3402. https://doi.org/10.1145/2470654.2466466

[43] Rishabh Khandelwal, Asmit Nayak, Paul Chung, and Kassem Fawaz. 2023. Unpacking Privacy Labels: A Measurement and Developer Perspective on Google’s Data Safety Section. arXiv:2306.08111 [cs.CY]

[44] Mugdha Khedkar, Ambuj Kumar Mondal, and Eric Bodden. 2026. A Study of Privacy-Related Data Collected by Android Apps. Automated Software Engineering 33, 2 (2026), 45. https://doi.org/10.1007/s10515-025-00589-3

[45] Mugdha Khedkar, Ambuj Kumar Mondal, and Eric Bodden. 2024. Do Android App Developers Accurately Report Collection of Privacy-Related Data? In Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering Workshops (Sacramento, CA, USA) (ASEW ’24). Association for Computing Machinery, New York, NY, USA, 176–186. https://doi.or...

[46] Ruiyin Li, Peng Liang, Mohamed Soliman, and Paris Avgeriou. 2021. Understanding Architecture Erosion: The Practitioners’ Perceptive. In 2021 IEEE/ACM 29th International Conference on Program Comprehension (ICPC). 311–322. https://doi.org/10.1109/ICPC52881.2021.00037

[47] Tianshi Li, Lorrie Faith Cranor, Yuvraj Agarwal, and Jason I. Hong. 2024. Matcha: An IDE Plugin for Creating Accurate Privacy Nutrition Labels. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8, 1 (March 2024), 1–38. https://doi.org/10.1145/3643544

[48] Tianshi Li, Kayla Reiman, Yuvraj Agarwal, Lorrie Faith Cranor, and Jason I. Hong. 2022. Understanding Challenges for Developers to Create Accurate Privacy Nutrition Labels. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 588, 24 pages. https://doi.org/10.1145/3491102.3502012

[50] Anthony Peruma, Timothy Huo, Ana Catarina Araújo, Jake Imanaka, and Rick Kazman. 2024. A Developer-Centric Study Exploring Mobile Application Security Practices and Challenges. arXiv:2408.09032 [cs.CR] https://arxiv.org/abs/2408.09032

[51] David Rodriguez, Akshath Jain, Jose M. Del Alamo, and Norman Sadeh. 2023. Comparing Privacy Label Disclosures of Apps Published in both the App Store and Google Play Stores. In 2023 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW). 150–157. https://doi.org/10.1109/EuroSPW59978.2023.00022

[52] Per Runeson and Martin Höst. 2009. Guidelines for conducting and reporting case study research in software engineering. Empirical Software Engineering 14 (2009), 131–164. https://api.semanticscholar.org/CorpusID:207144526

[53] Yusei Sakuraba, Hiroki Inayoshi, Shoichi Saito, and Akito Monden. 2025. Plaintext in the Wild: Investigating Secure Connection Label Accuracy for Android Apps. In 2025 IEEE International Conference on Source Code Analysis & Manipulation (SCAM). 145–156. https://doi.org/10.1109/SCAM67354.2025.00022

[54] Sk Golam Saroar and Maleknaz Nayebi. 2023. Developers’ Perception of GitHub Actions: A Survey Analysis. arXiv:2303.04084 [cs.SE] https://arxiv.org/abs/2303.04084

[55] Grishma Shrestha, Shristi Shrestha, and Anas Mahmoud. 2025. No Country for Indie Developers: A Study of Google Play’s Closed Testing Requirements for New Personal Developer Accounts. ACM Trans. Softw. Eng. Methodol. (May 2025). https://doi.org/10.1145/3736578 Just Accepted.

[56] Rocky Slavin, Xiaoyin Wang, Mitra Bokaei Hosseini, James Hester, Ram Krishnan, Jaspreet Bhatia, Travis D. Breaux, and Jianwei Niu. 2016. Toward a framework for detecting privacy policy violations in android application code. In Proceedings of the 38th International Conference on Software Engineering (Austin, Texas) (ICSE ’16). Association for Computing Machi...

[57] Mohammad Tahaei, Kami Vaniea, and Naomi Saphra. 2020. Understanding Privacy-Related Questions on Stack Overflow. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–14. https://doi.org/10.1145/3313831.3376768
