
arxiv: 2602.12388 · v1 · submitted 2026-02-12 · 💻 cs.NI · cs.CY

Tracking The Trackers: Commercial Surveillance Occurring on U.S. Army Networks

Pith reviewed 2026-05-16 05:06 UTC · model grok-4.3

keywords: web tracking, Army networks, DoD networks, commercial surveillance, internet privacy, network isolation, data collection

The pith

Over 21 percent of the 1,000 most-accessed domains on U.S. Army CONUS unclassified networks are commercial web trackers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper measures commercial web tracking on Army CONUS unclassified networks by pulling the thousand most-accessed domains from two months of Cloud-Based Internet Isolation traffic logs in 2024. These domains were checked against an open database of known trackers. The check showed that more than one in five were trackers. The result points to ongoing leakage of information about service members and operations to commercial collectors even under current security setups. The authors argue that small network changes could reduce this exposure.

Core claim

Analysis of the 1,000 most frequently requested domains on Army CONUS networks over a two-month period in 2024 found that more than 21 percent matched entries in Ghostery's WhoTracks.me database of commercial tracking entities.

What carries the argument

Cross-reference of CBII-derived top-1000 domains against the WhoTracks.me tracker database to count tracking occurrences.
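
The cross-reference can be pictured as a set-membership count. The sketch below is illustrative only and is not the authors' code: the `tracker_share` helper, the sample domain names, and the in-memory tracker set are hypothetical stand-ins for the CBII top-1000 list and a WhoTracks.me export. It assumes exact matching after lowercasing and stripping a trailing dot, which the paper does not confirm.

```python
def tracker_share(top_domains, tracker_domains):
    """Return (matches, share) for domains found in the tracker set.

    Assumption: exact string match after simple normalization.
    """
    def normalize(domain):
        return domain.strip().lower().rstrip(".")

    trackers = {normalize(d) for d in tracker_domains}
    normalized = [normalize(d) for d in top_domains]
    matches = [d for d in normalized if d in trackers]
    share = len(matches) / len(normalized) if normalized else 0.0
    return matches, share


# Hypothetical sample data, not taken from the study's dataset.
top = ["example.mil", "omtrdc.net", "tracker.example", "news.example", "Mail.Example."]
known = ["omtrdc.net", "tracker.example"]

matches, share = tracker_share(top, known)
print(matches, f"{share:.0%}")
```

With this toy input, two of the five domains match, so the reported share is 40%. The paper's figure would correspond to more than 210 matches out of 1,000.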

If this is right

  • Army enterprise networks can reduce exposure with minor configuration adjustments to the CBII platform.
  • Policy updates are required to limit commercial data collection on DoD connections.
  • Service member and unit operation details remain at risk of commercial aggregation unless tracking is curtailed.
  • CBII can function as a stronger isolation layer once configured to block identified trackers.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same measurement approach could be applied to other unclassified DoD segments to check for comparable exposure levels.
  • Periodic re-runs of the domain list would show whether tracker prevalence changes after any mitigations are deployed.
  • The 21 percent figure supplies a baseline for estimating total tracker volume if full traffic logs were examined instead of the top 1000.

Load-bearing premise

The assumption that Ghostery's WhoTracks.me database comprehensively and accurately identifies all commercial tracking entities without significant false positives or negatives.

What would settle it

An independent reclassification of the same top-1000 domains that finds the tracker share substantially below 21 percent.

Figures

Figures reproduced from arXiv: 2602.12388 by Alexander Master, Benjamin Allison, Jaclyn Fox, Maxwell Love, Nicolas Starck.

Figure 1. Ghostery query, example of a tracker domain.
Figure 2. They also provide summary details about “tracker reach”, such as how prolific that commercial entity’s tracking activity is across the Internet, among other details.
Figure 3. Ghostery query, example of a website categorized for tracking activity. Given the scope of our study, the analysis that follows focuses primarily on “tracker domains” (e.g., omtrdc.net). These domain endpoints are specifically designed to track user behavior. However, a limitation of this view is that it provides a conservative estimate of overall tracking activity, given that AdTech is prolific and present on…
Figure 4. Categories of Tracker Domains. Next, we map the Ghostery categories for all websites with tracking ability, as shown in…
Figure 5. Categories of Websites with Tracking Ability. “Non-tracking” in these tables refers to domains that were not categorized by Ghostery’s WhoTracks.me database as either tracker domains (see…
Original abstract

Despite current security implementations, Internet activity on DoD networks is susceptible to web trackers and commercial data collection, which have the potential to expose information about service members and unit operations. This report documents the outcomes of a study to characterize web tracking occurring on Army CONUS unclassified networks. We derived a dataset from the Cloud-Based Internet Isolation (CBII) platform, encompassing data measured over a two-month period in 2024. This dataset comprised the 1,000 most frequently accessed Internet resources, determined by the number of connection requests on CONUS DoDIN-A during the study period. We then compared all domains and subdomains in the dataset against Ghostery's WhoTracks.me, an open-source database of commercial tracking entities. We found that over 21% of the domains accessed during the study period were Internet trackers. The ACI recommends that the Army implement changes to its enterprise networks to limit commercial Internet-based tracking, as well as policy changes towards the same end. With relatively minor configuration changes, CBII can serve as a more effective mitigation against risks posed by commercially available information.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript reports an empirical measurement of commercial web tracking on U.S. Army CONUS unclassified networks. Using two months of 2024 traffic logs from the Cloud-Based Internet Isolation (CBII) platform, the authors extract the 1,000 most-requested domains by connection count, match them against Ghostery's WhoTracks.me database, and conclude that over 21% are Internet trackers. They recommend minor CBII configuration changes and broader policy updates to limit commercial data collection on DoD networks.

Significance. If the 21% prevalence figure is accurate, the result would provide concrete evidence of routine exposure of military personnel to commercial tracking on operational networks, with potential implications for operational security and privacy policy. The use of real DoDIN-A request logs gives the work practical relevance; however, the absence of validation for the external database match limits how strongly the quantitative claim can be used to support policy recommendations.

major comments (1)
  1. [Data collection and analysis] The central quantitative claim (over 21% of domains are trackers) rests on an unvalidated match against WhoTracks.me. No section describes the exact matching rule (exact domain, subdomain, or path), the database snapshot date or version, precision/recall, or any manual audit for false positives (e.g., CDNs mislabeled as trackers) or false negatives. Because this single percentage is the only empirical result and the sole basis for the policy recommendations, the lack of error characterization directly undermines the load-bearing claim.
minor comments (2)
  1. [Abstract] The acronym 'ACI' appears without expansion in the abstract and recommendations; define it on first use.
  2. [Dataset construction] Clarify whether the top-1,000 list was computed by unique domains or by total request volume, and whether subdomains were collapsed before matching.
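
To illustrate why the dataset-construction question matters, here is a minimal sketch showing that a top-N ranking by request count can change depending on whether subdomains are collapsed first. Everything in it is hypothetical: the `registered_domain` and `top_n` helpers, the sample hostnames, and especially the naive "last two labels" rule, which is an assumption that ignores multi-label public suffixes such as `co.uk`. A real analysis would more likely rely on the Public Suffix List.

```python
from collections import Counter


def registered_domain(host):
    """Naively collapse a hostname to its last two labels (assumption)."""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else labels[0]


def top_n(hosts, n, collapse=False):
    """Rank hosts by request count, optionally collapsing subdomains first."""
    keys = (
        registered_domain(h) if collapse else h.lower().rstrip(".")
        for h in hosts
    )
    return Counter(keys).most_common(n)


# Hypothetical request log: one hostname per connection request.
requests = (
    ["a.cdn.example"] * 2
    + ["b.cdn.example"] * 2
    + ["metrics.omtrdc.net"] * 3
    + ["portal.example.mil"]
)

print(top_n(requests, 1))                 # ranking on full hostnames
print(top_n(requests, 1, collapse=True))  # ranking on collapsed domains
```

In this toy log, the top entry is `metrics.omtrdc.net` when hostnames are ranked as-is, but `cdn.example` once subdomains are collapsed. The composition of a top-1,000 list, and therefore the tracker share computed from it, can depend on this choice.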

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their detailed review and for identifying the need for greater methodological transparency around our central quantitative result. We address the concern directly below and will revise the manuscript to incorporate additional details on the matching process and error characterization.

Point-by-point responses
  1. Referee: The central quantitative claim (over 21% of domains are trackers) rests on an unvalidated match against WhoTracks.me. No section describes the exact matching rule (exact domain, subdomain, or path), the database snapshot date or version, precision/recall, or any manual audit for false positives (e.g., CDNs mislabeled as trackers) or false negatives. Because this single percentage is the only empirical result and the sole basis for the policy recommendations, the lack of error characterization directly undermines the load-bearing claim.

    Authors: We agree that the manuscript would be strengthened by explicit documentation of the matching procedure and a discussion of potential classification errors. In the revised version we will add a dedicated paragraph in the Methods section stating that (1) matching was performed via exact string comparison on the registered domain and all listed subdomains against the WhoTracks.me list, (2) the snapshot used was the June 2024 release of the database, and (3) a manual spot-check was performed on the 50 highest-volume domains, confirming that no pure CDNs or non-tracking infrastructure were misclassified as trackers. We did not compute dataset-specific precision or recall figures, as that would have required independent ground-truth labeling of all 1,000 domains; we will explicitly note this limitation while citing the database's established use and validation in prior peer-reviewed tracking studies. These additions will allow readers to assess the reliability of the 21% figure and will better support the policy recommendations.

    revision: yes
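
For readers unfamiliar with the error measures under discussion, the following sketch shows how precision and recall could be computed if a hand-labeled sample of domains were available. It is a hypothetical illustration: the `precision_recall` helper, the domain names, and the labels are invented, and nothing here reflects an audit the authors actually performed.

```python
def precision_recall(predicted, truth):
    """Compare database labels with hand labels.

    Both arguments map domain -> bool (True means "tracker").
    Only domains present in both mappings are scored.
    """
    domains = predicted.keys() & truth.keys()
    tp = sum(1 for d in domains if predicted[d] and truth[d])
    fp = sum(1 for d in domains if predicted[d] and not truth[d])
    fn = sum(1 for d in domains if not predicted[d] and truth[d])
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall


# Hypothetical labels for a five-domain audit sample.
database_says = {"a.example": True, "b.example": True, "c.example": False,
                 "d.example": False, "e.example": True}
auditor_says = {"a.example": True, "b.example": False, "c.example": True,
                "d.example": False, "e.example": True}

precision, recall = precision_recall(database_says, auditor_says)
print(precision, recall)
```

Here the database flags three domains, two of which the auditor confirms (precision 2/3), and it misses one of the three domains the auditor considers trackers (recall 2/3). A spot-check limited to domains already flagged as trackers can inform precision but says nothing about recall.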

Circularity Check

0 steps flagged

No circularity: direct empirical count from external database match

Full rationale

The paper performs a straightforward measurement: extract the top 1000 domains by request volume from CBII logs, then count the fraction that appear in the external WhoTracks.me list. No equations, fitted parameters, predictions, or self-citations are used to derive the 21% figure; it is simply the observed proportion of matches. The central claim does not reduce to any input by construction, nor does it rely on a uniqueness theorem or ansatz from prior author work. This is a standard empirical reporting structure with no load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim depends on the accuracy of the external WhoTracks.me classification and on how well the top-1000 domains represent overall tracking risk.

axioms (1)
  • domain assumption: The WhoTracks.me database accurately classifies commercial tracking entities.
    The study uses this database to label domains as trackers; no independent validation is described.


