Expecting (Targeted Ads)? Network Analysis of User Health Data Leakage in Fertility Tracking Apps

Adam Bates; Brad Reaves; Camille Cobb; Mahnoor Jameel; Shahanaasree Sivakumar; Yeeun Jo

arxiv: 2606.26276 · v3 · pith:F2UAOOECnew · submitted 2026-06-24 · 💻 cs.CR

Yeeun Jo, Shahanaasree Sivakumar, Mahnoor Jameel, Camille Cobb, Adam Bates, Brad Reaves

Pith reviewed 2026-06-29 04:36 UTC · model grok-4.3

classification 💻 cs.CR

keywords fertility tracking apps · data leakage · network measurement · targeted advertising · health data privacy · Android apps · menstrual data

The pith

Five fertility tracking apps send users' menstrual and pregnancy data to advertising services.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper performs a network measurement on 20 Android fertility apps to check how user health data flows to third parties during normal use. It records traffic from standardized interactions and finds explicit health data leaks plus targeted ad URLs in five apps. Other apps monetize with ads yet show no such leaks, and a few interact with ad services only minimally. This supplies concrete technical evidence that privacy outcomes depend on which app a user picks. The results matter because fertility data is highly sensitive and users often have little visibility into these flows.

Core claim

After systematizing features across the 20 apps, the study records TLS-stripped network traffic during controlled user interactions and identifies explicit leakage of user health data together with implicit leakage via highly targeted contextual advertising URLs in a subset of five apps.

What carries the argument

Network traffic recording of TLS-stripped requests generated by standardized user interactions across the fertility apps.
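
A minimal sketch of the kind of check that decrypted traffic makes possible, assuming the captured requests are already available as plain URL strings. The ad-domain list, the health-term list, and the example URLs are hypothetical placeholders, not the paper's classification rules.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical placeholders: the paper's actual domain and term lists are not given here.
AD_DOMAINS = {"ads.example.com", "tracker.example.net"}
HEALTH_TERMS = ("pregnan", "trimester", "period", "ovulat", "cycle")


def is_ad_host(host: str) -> bool:
    """True if host equals, or is a subdomain of, a listed ad domain."""
    return any(host == d or host.endswith("." + d) for d in AD_DOMAINS)


def find_health_params(url: str) -> list[tuple[str, str]]:
    """Return query parameters of an ad-bound request that mention a health term."""
    parts = urlsplit(url)
    if not is_ad_host(parts.hostname or ""):
        return []  # not sent to a listed ad domain, so out of scope for this check
    hits = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        text = f"{key}={value}".lower()
        if any(term in text for term in HEALTH_TERMS):
            hits.append((key, value))
    return hits


# Invented example requests, for illustration only.
captured = [
    "https://ads.example.com/getad?slot=home&stage=pregnancy_week_12",
    "https://api.example.org/sync?cycle_day=14",  # not an ad host, ignored
    "https://tracker.example.net/imp?id=123",     # ad host, no health terms
]
flagged = {url: find_health_params(url) for url in captured}
```

Plain substring matching like this only catches values sent in readable form; encoded or hashed values would need additional handling.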

If this is right

Some apps achieve ad-based revenue without transmitting identifiable health data.
Privacy differences between apps are observable through network analysis rather than self-reported policies.
Users can avoid certain data flows by selecting apps that show minimal ad-network contact.
Technical measurements can confirm or refute user worries about fertility-app data handling.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar network checks could reveal whether leakage patterns appear in other categories of health or period-tracking software.
App stores might surface data-sharing summaries derived from traffic analysis to help users compare options.
Developers of ad-supported health apps could adopt the minimal-interaction patterns observed in the non-leaking examples.

Load-bearing premise

The lab setup with fixed user actions and stripped network captures fully represents the data sharing that occurs in everyday use.

What would settle it

Real-user sessions on the same apps in which no menstrual or pregnancy details appear in requests sent to known ad domains.

Figures

Figures reproduced from arXiv: 2606.26276 by Adam Bates, Brad Reaves, Camille Cobb, Mahnoor Jameel, Shahanaasree Sivakumar, Yeeun Jo.

**Figure 2.** Number of HTTP Requests by interaction session to different advertising network service types.

| Endpoint Role | Requests | Perc. |
|---|---|---|
| Configuration | 234 | 3% |
| Conversion Tracking | 146 | 2% |
| Cookie Synchronization | 60 | 1% |
| Event Tracking | 257 | 3% |
| Get Ad | 5,032 | 64% |
| Impression Tracking | 1,022 | 13% |
| Static Content | 979 | 13% |
| Unclear | 99 | 1% |
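
The percentage column printed with Figure 2 can be re-derived from the request counts. A small check, assuming the eight endpoint roles listed there are the complete set of ad-network requests:

```python
# Request counts per endpoint role, copied from the table printed with Figure 2.
counts = {
    "Configuration": 234,
    "Conversion Tracking": 146,
    "Cookie Synchronization": 60,
    "Event Tracking": 257,
    "Get Ad": 5032,
    "Impression Tracking": 1022,
    "Static Content": 979,
    "Unclear": 99,
}

total = sum(counts.values())

# Share of each role, rounded to a whole percent as in the table.
shares = {role: round(100 * n / total) for role, n in counts.items()}

for role, n in counts.items():
    print(f"{role:<24}{n:>6}{shares[role]:>5}%")
```

Under that assumption the counts sum to 7,829 requests, and the rounded shares reproduce the printed column, with "Get Ad" requests at about 64% of the total.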

**Figure 4.** Number of HTTP Requests per App and Inter

**Figure 5.** In BabyCenter, the getAdsUserStage function in com.babycenter.pregbaby.api.model.ChildViewModel populates the csw and us custom parameters.

**Figure 6.** In What to Expect, numerous functions in app/src/main/java/com/whattoexpect/ad/AdManager support the construction of query parameters that leak user health data. From the accompanying text: "Once again, the combination of string literals and function names provides strong evidence that these values are being constructed based on dynamic user inputs. Notably absent from the AdManager code is explicit logic for constructing c…"
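
To illustrate how custom parameters such as the csw and us keys named in Figure 5 might be read back out of a captured ad request, here is a hedged sketch. It assumes the key-value pairs travel percent-encoded inside a single `cust_params` query field, a convention used by some ad-serving URLs; the source does not state the wire format, and the host, path, and values below are invented.

```python
from urllib.parse import urlsplit, parse_qs


def extract_custom_params(ad_url: str, field: str = "cust_params") -> dict[str, str]:
    """Decode nested key-value pairs carried inside one query field.

    The field name is an assumption, not something confirmed by the paper.
    """
    query = parse_qs(urlsplit(ad_url).query)
    raw = query.get(field, [""])[0]  # parse_qs has already percent-decoded this value
    return {key: values[0] for key, values in parse_qs(raw).items()}


# Invented example: the values for csw and us are placeholders, not observed data.
url = "https://ads.example.com/ads?iu=/1234/app&cust_params=csw%3D12%26us%3Dpregnant"
params = extract_custom_params(url)
```

Decoding the nested field this way makes each custom key inspectable on its own, which is what allows a parameter to be correlated with a specific interaction session.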

Original abstract

While human factors in the privacy of fertility tracking apps -- health trackers that record users' menstrual or pregnancy data -- has been the subject of extensive study, little attention has been paid to the technical aspects of apps' data handling practices. We conduct a network-based measurement study of a corpus of 20 Android fertility tracking apps from the Google Play Store, focusing on how user data is shared with third party advertising services. After systematizing app features, we conduct a series of standardized user interactions across all apps in an environment that records TLS-stripped network traffic. In a subset of apps (n=5) we identify explicit leakage of user health data as well implicit leakage through highly targeted contextual advertising URL's. Equally importantly, we observe additional apps that use an ad-based monetization model without apparent leakage of user data, as well as several apps the interact only minimally with ad services. These findings provide technical grounding for widespread user concerns, but also underscore the importance of consumer choice in the privacy implications of app-based fertility tracking.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper measures network traffic from 20 fertility apps and reports explicit health data leakage in five plus some implicit ad targeting, but the methods are described at too high a level to assess the results.

The main point here is that the authors ran network captures on 20 Android fertility tracking apps during standardized interactions and found explicit user health data leaving in five of them, along with some highly targeted ad URLs that suggest implicit leakage. They also note several apps that use ads without apparent data sharing and a few that barely touch ad services at all.

The work applies standard TLS-stripped traffic recording to an app category that earlier papers studied mainly from a human-factors angle. Systematizing features first and then looking for both explicit and contextual leakage supplies technical grounding that was missing before. The corpus itself is new, so the specific measurements add to the record even if the technique is routine.

The soft spots are mostly about missing detail. The abstract supplies no criteria for choosing the twenty apps, no list of the interaction scripts, and no description of how traffic was classified as health data or as targeted advertising. Without those pieces it is hard to know whether the positive findings are solid or whether the apps labeled clean really stayed clean. The coverage concern is real: a fixed set of lab interactions can miss paths that only appear after longer use, specific health events, or ad-network callbacks. That leaves both the leakage findings and the no-leakage findings dependent on how completely the interaction model covers real use.

This is aimed at mobile privacy and security researchers who track data flows in health apps. It is worth sending for peer review because the empirical angle is useful and the topic matters, though any review will need to press for the missing methodological specifics and possibly more varied interaction testing.

Referee Report

2 major / 1 minor

Summary. The manuscript reports a network measurement study of 20 Android fertility tracking apps from the Google Play Store. After systematizing features, the authors perform standardized user interactions in a controlled TLS-stripped traffic environment and identify explicit leakage of user health data plus implicit leakage via targeted contextual advertising URLs in a subset of 5 apps. They also report additional apps that monetize via ads without apparent leakage and several with minimal ad-service interaction, providing technical grounding for privacy concerns while emphasizing consumer choice.

Significance. If the measurements hold, the work supplies direct empirical observations of data flows to third-party ad services in a sensitive health domain. Its strengths are the direct network recording approach and the balanced finding that ad-based monetization does not uniformly imply leakage. The study adds concrete technical evidence to the literature on mobile privacy, though its impact depends on the completeness and reproducibility of the interaction model.

major comments (2)

[Abstract / Methodology] Abstract and Methodology section: the headline claim of explicit leakage in n=5 apps (and absence in others) rests on traffic observed during standardized interactions, yet the manuscript supplies no details on app selection criteria, exact interaction scripts, traffic classification rules, or verification steps. Without these elements the support for the central claim cannot be evaluated.
[Results] Results section: the observations of both leakage and 'no apparent leakage' are sensitive to the coverage of the interaction model. The manuscript does not enumerate or justify how the fixed set of standardized interactions addresses potential conditional paths (cumulative usage history, specific health-event sequences, device state, or ad-network callbacks after prolonged sessions), which directly affects the reliability of the positive and negative findings.

minor comments (1)

[Abstract] Abstract: 'the interact only minimally' appears to be a typographical error and should read 'that interact only minimally'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which identify key areas where additional detail and discussion will strengthen the manuscript. We address each major comment below.

Referee: [Abstract / Methodology] Abstract and Methodology section: the headline claim of explicit leakage in n=5 apps (and absence in others) rests on traffic observed during standardized interactions, yet the manuscript supplies no details on app selection criteria, exact interaction scripts, traffic classification rules, or verification steps. Without these elements the support for the central claim cannot be evaluated.

Authors: We agree that the manuscript currently lacks these methodological details, which are essential for evaluating and reproducing the central claims. In the revised version we will expand the Methodology section to specify the app selection criteria (top apps by downloads and ratings with feature diversity), provide the exact standardized interaction scripts, detail the traffic classification rules used to identify explicit health data leakage versus targeted ad URLs, and describe the verification steps performed. This will directly support the claims with transparent evidence.

revision: yes

Referee: [Results] Results section: the observations of both leakage and 'no apparent leakage' are sensitive to the coverage of the interaction model. The manuscript does not enumerate or justify how the fixed set of standardized interactions addresses potential conditional paths (cumulative usage history, specific health-event sequences, device state, or ad-network callbacks after prolonged sessions), which directly affects the reliability of the positive and negative findings.

Authors: We acknowledge that the reliability of both positive and negative findings is sensitive to interaction coverage and that the manuscript does not explicitly address conditional paths. We will revise the Results section to enumerate the performed interactions, justify their basis in the systematized feature analysis for typical first-use scenarios, and add a limitations discussion that notes the absence of prolonged-session or history-dependent testing while outlining implications for the reported leakage and non-leakage observations.

revision: partial

Circularity Check

0 steps flagged

No circularity: direct empirical network measurement with no derivations or self-referential fits

The paper is a measurement study that records TLS-stripped network traffic from standardized user interactions in 20 fertility apps and reports observed data flows to third-party services. No equations, parameters, or derivations are present. Claims rest on direct observation of external network behavior rather than any reduction to fitted inputs or self-citation chains. The method's coverage limitations (raised in the referee report) concern experimental completeness, not circularity in a derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Empirical measurement study; contains no mathematical model, free parameters, axioms, or invented entities.
