Tracking Conversations: Measuring Content and Identity Exposure on AI Chatbots
Pith reviewed 2026-05-14 22:12 UTC · model grok-4.3
The pith
Seventeen of twenty popular AI chatbots share conversation data with third parties.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under controlled settings using a sensitive prompt, network traffic capture shows that 17 of 20 chatbots share information with at least one third party. Three chatbots transmit plaintext conversation text, including prompt and response snippets, to Microsoft Clarity through session replay. Fifteen chatbots forward conversation URLs or chat identifiers to third-party advertising, analytics, or social endpoints, and several expose user identity details such as hashed emails or account identifiers.
What carries the argument
Network traffic capture during normal and private chat sessions that identifies exposure of content (prompts, URLs, identifiers) and identity (cookies, emails, IP fields) to third-party endpoints.
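The capture methodology hinges on classifying each observed request as first-party or third-party relative to the chatbot's site. A minimal sketch of that classification (domains here are hypothetical; the two-label approximation of a registrable domain is an assumption — a production audit would consult the Public Suffix List, e.g. via `tldextract`):

```python
from urllib.parse import urlparse

def site(host: str) -> str:
    """Approximate the registrable domain by the last two DNS labels.
    (Assumption: a real audit would use the Public Suffix List instead.)"""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host

def is_third_party(request_url: str, page_url: str) -> bool:
    """A request counts as third-party when its registrable domain
    differs from that of the page hosting the chatbot."""
    req_host = urlparse(request_url).hostname or ""
    page_host = urlparse(page_url).hostname or ""
    return site(req_host) != site(page_host)

# Hypothetical endpoints for illustration:
is_third_party("https://c.clarity.ms/collect", "https://chat.example.com/")    # True
is_third_party("https://api.example.com/v1/chat", "https://chat.example.com/")  # False
```

The two-label heuristic misclassifies multi-part public suffixes such as `co.uk`, which is why measurement studies typically rely on the Public Suffix List.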
If this is right
- Conversation text can reach analytics providers without explicit user consent.
- Chat URLs and identifiers allow third parties to associate separate conversations with the same user.
- Private chat modes do not reliably prevent third-party receipt of content or identifiers.
- Identity details such as hashed emails can leak through support widgets or analytics tags.
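The hashed-email leak in the last bullet is detectable because hashing is deterministic: given the test account's email, one can precompute its digests and search third-party payloads for them. A sketch under stated assumptions (the normalization variants and the `uid=` field name are illustrative, not findings from the paper):

```python
import hashlib

def hashed_email_digests(email: str) -> set[str]:
    """Digests of the normalized email (lowercased, stripped).
    Assumption: which hash variants a given tag sends varies by vendor."""
    norm = email.strip().lower()
    return {
        hashlib.sha256(norm.encode()).hexdigest(),
        hashlib.md5(norm.encode()).hexdigest(),  # some legacy tags still use MD5
    }

def payload_leaks_email(payload: str, email: str) -> bool:
    """True if any known digest of the email appears in the payload."""
    body = payload.lower()
    return any(d in body for d in hashed_email_digests(email))

# Synthetic payload carrying a SHA-256 of the test email:
email = "alice@example.com"
fake_payload = "uid=" + hashlib.sha256(email.encode()).hexdigest()
payload_leaks_email(fake_payload, email)  # True
```

This determinism is also why hashed emails are linkable across sites, which is the privacy concern the review raises.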
Where Pith is reading between the lines
- Users who discuss medical or financial topics may expose details to advertisers even when they choose private mode.
- Providers could limit exposure by auditing or removing session-replay and analytics scripts from chatbot pages.
- Regulators might examine whether current privacy notices cover third-party sharing of live chat content.
Load-bearing premise
That controlled lab tests with chosen sensitive prompts and network capture represent all tracking that occurs for typical users in real production settings.
What would settle it
Run the same sensitive prompt on any of the 20 chatbots from an ordinary browser, inspect the live network requests, and check whether any third-party payload contains the matching plaintext prompt or chat identifier.
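That check can be scripted against a HAR export of the browser session. The `log.entries[].request` layout below is the standard HAR shape; everything else (endpoints, the probe text, the chat identifier, the substring-based first-party filter) is illustrative:

```python
import json

def flag_exposures(har_json: str, first_party: str, probe: str, chat_id: str) -> list[str]:
    """Return URLs of requests outside the first-party site whose URL or
    POST body contains the probe prompt or the chat identifier."""
    hits = []
    for entry in json.loads(har_json)["log"]["entries"]:
        req = entry["request"]
        url = req["url"]
        if first_party in url:
            continue  # crude filter; a real audit compares registrable domains
        body = (req.get("postData") or {}).get("text", "")
        if probe in url or probe in body or chat_id in url or chat_id in body:
            hits.append(url)
    return hits

# Minimal synthetic HAR (hypothetical endpoints, not from the paper):
har = json.dumps({"log": {"entries": [
    {"request": {"url": "https://chat.example.com/api",
                 "postData": {"text": "my prompt"}}},
    {"request": {"url": "https://tracker.example.net/c?cid=abc123"}},
]}})
flag_exposures(har, "chat.example.com", "my prompt", "abc123")
# → ['https://tracker.example.net/c?cid=abc123']
```

Finding the chat identifier in a third-party URL, as in the synthetic example, is exactly the URL/identifier sharing the paper reports for fifteen chatbots.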
Original abstract
AI chatbots are becoming a primary interface for seeking information. As their popularity grows, chatbot providers are starting to deploy advertising and analytics. Despite this, tracking on AI chatbots has not been systematically studied. We present a systematic measurement of web tracking on 20 popular AI chatbots. Under controlled settings using a sensitive prompt, we capture and compare network traffic in normal chats and, where supported, private chats. We search for exposure of two categories of information: content, including prompts, prompt-derived titles, chat URLs, and chat identifiers; and identity, including names, emails, account identifiers, first-party cookies, and explicit IP/User-Agent fields in payloads. We find that 17 of 20 chatbots share information with at least one third party. Three chatbots share plaintext conversation text, including both prompt and response snippets, with Microsoft Clarity through session replay. Fifteen chatbots share conversation URLs or chat identifiers with third-party advertising, analytics, or social endpoints. Several chatbots expose user identity through support widgets, analytics, advertising, and session replay tags; in some cases, hashed emails are shared.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript reports the results of a measurement study that examines web tracking and data exposure on 20 popular AI chatbots. Using controlled experiments with sensitive prompts, the authors capture network traffic to identify sharing of content (prompts, responses, chat URLs, identifiers) and identity information (names, emails, cookies, IPs) with third parties. Key findings include that 17 of the 20 chatbots share information with at least one third party, with three transmitting plaintext conversation text to Microsoft Clarity, and fifteen sharing conversation URLs or identifiers with advertising, analytics, or social endpoints.
Significance. This study is significant as it provides the first systematic empirical evidence of tracking practices in AI chatbots, a rapidly growing user interface. The direct observation of network traffic under controlled conditions yields concrete, falsifiable results about data exposure, including specific examples of plaintext sharing. These findings highlight privacy vulnerabilities that could inform better design practices, user awareness, and regulatory efforts in the field of AI privacy and security.
minor comments (2)
- The methodology would benefit from an explicit list or table enumerating the 20 chatbots and the criteria used for their selection to improve reproducibility.
- Results on private-chat mode (mentioned in the abstract) should be presented with the same level of per-chatbot granularity as the normal-chat results to clarify any differences in tracking exposure.
Simulated Author's Rebuttal
We thank the referee for their positive summary of our measurement study and for recognizing its significance as the first systematic empirical evidence of tracking in AI chatbots. No major comments were raised. In response to the two minor comments, we will add a table enumerating the 20 chatbots along with our selection criteria, and we will report private-chat results at the same per-chatbot granularity as the normal-chat results.
Circularity Check
No significant circularity: pure empirical traffic measurement
full rationale
The paper conducts a controlled measurement of network traffic from 20 AI chatbots using sensitive prompts, directly observing payload contents sent to third-party endpoints. No equations, derivations, fitted parameters, or load-bearing self-citations appear in the reported claims. Central results (17/20 chatbots share data; three transmit plaintext snippets to Microsoft Clarity) rest solely on captured traffic observations under the stated lab conditions, with no reduction to any constructed model or prior author work.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Captured network requests during controlled sessions represent the complete set of tracking activity performed by the chatbots.