Recognition: unknown
Profiling for Pennies: Unveiling the Privacy Iceberg of LLM Agents
Pith reviewed 2026-05-08 09:08 UTC · model grok-4.3
The pith
LLM agents can reconstruct detailed personal profiles from minimal PII seeds with over 90 percent factual accuracy for under three dollars.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
IcebergExplorer uses minimal personally identifiable information as an initial search seed and leverages LLM web access plus reasoning to reconstruct profiles that reach over 90 percent factual accuracy in under 10 minutes for less than three dollars. The companion PrivacyIceberg framework divides real-world privacy exposure into explicitly searched, contextually inferred, and deeply aggregated tiers according to the depth of LLM exploitation.
What carries the argument
IcebergExplorer, a tool that starts from minimal PII seeds and applies LLM-driven web search and reasoning to aggregate and verify profile details across the three tiers of the PrivacyIceberg model; a minimal sketch of that loop follows below.
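To make the claimed machinery concrete, here is a minimal sketch of the seed-to-profile loop as the review describes it. Every helper name (`make_query`, `extract_facts`, `infer_context`, `aggregate`, `search_web`) is a hypothetical placeholder, not the authors' implementation.

```python
# Hypothetical sketch of the seed-to-profile loop described above. The
# helper names stand in for an LLM agent's web-search and reasoning
# calls; they are not from the paper.

def reconstruct_profile(seed_pii: dict, llm, search_web, max_rounds: int = 5) -> dict:
    """Iteratively expand a profile from a minimal PII seed."""
    profile = dict(seed_pii)  # tier 1 starts from the explicit seed
    for _ in range(max_rounds):
        query = llm.make_query(profile)           # turn known facts into a search query
        pages = search_web(query)                 # retrieve public web content
        profile.update(llm.extract_facts(pages))  # tier 1: facts stated verbatim
        profile.update(llm.infer_context(pages, profile))  # tier 2: contextual inference
    profile.update(llm.aggregate(profile))  # tier 3: deep aggregation across all evidence
    return profile
```

The design point the paper's tiers imply is that each pass can feed later, more inferential passes: facts found verbatim become the context that makes inference and cross-source aggregation possible.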
If this is right
- Platforms fail to address privacy concerns either technically or through policy, creating a gap between practice and public awareness.
- Six root causes drive the observed privacy disclosures in LLM-integrated systems.
- Multi-stakeholder countermeasures are required involving LLM vendors, individuals, and data publishers.
- Privacy risks scale with the sophistication of LLM exploitation from basic searches to deep aggregation.
Where Pith is reading between the lines
- Widespread availability of such tools could normalize low-cost profiling and increase the chilling effect on online behavior.
- Individuals may need to reduce public data footprints more aggressively as LLM agents improve in aggregation speed.
- Regulators could treat LLM agent profiling capabilities as a distinct category when updating data protection rules.
Load-bearing premise
Minimal PII seeds combined with current LLM web-access and reasoning capabilities suffice to produce high-fidelity, generalizable profiles across diverse real-world individuals without substantial additional data or platform cooperation.
What would settle it
A controlled test on a diverse set of 100 real-world individuals in which the method achieves factual accuracy below 70 percent or requires costs above 10 dollars on average; the sketch below turns this criterion into code.
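The criterion is mechanical enough to state directly. A minimal sketch, assuming per-subject accuracy scores and dollar costs have already been measured via external verification (function and parameter names are illustrative, not from the paper):

```python
def is_claim_refuted(accuracies: list[float], costs_usd: list[float],
                     min_subjects: int = 100) -> bool:
    """Apply the refutation criterion above: mean factual accuracy below 70%
    or mean cost above $10 across a diverse, externally verified sample."""
    if len(accuracies) < min_subjects or len(accuracies) != len(costs_usd):
        raise ValueError("need matched accuracy/cost data for >= 100 subjects")
    mean_accuracy = sum(accuracies) / len(accuracies)
    mean_cost = sum(costs_usd) / len(costs_usd)
    return mean_accuracy < 0.70 or mean_cost > 10.0
```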
Original abstract
Large Language Models (LLMs) have revolutionized how information is collected, aggregated, and reasoned over. However, this enables a novel and accessible vector of privacy intrusion: automated, in-depth personal profiling, which engenders a chilling effect of "peepers everywhere". Existing research primarily unfolds from the training pipeline of LLMs, emphasizing the exposure of Personally Identifiable Information (PII) through memorization, while privacy studies from a human-centric perspective remain underexplored. To fill this void, we empirically investigate privacy perception in the real world through the lens of human awareness and the practices of LLM-integrated platforms, revealing a significant dissonance: platforms fail to address public privacy concerns either technically or through policy. To facilitate a systematic and quantifiable study of privacy risk, we propose the PrivacyIceberg, which categorizes real-world human privacy risks into three tiers: explicitly searched, contextually inferred, and deeply aggregated, based on the sophistication of LLM exploitation. We developed IcebergExplorer to audit privacy exposure, using minimal PII as a search seed to reconstruct high-fidelity profiles, achieving over 90% factual accuracy within 10 minutes at a cost under $3 in real-world scenarios. Additionally, we identify six root causes contributing to such privacy disclosures and propose multi-stakeholder countermeasures for LLM vendors, individuals, and data publishers.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper empirically investigates privacy risks from LLM agents, highlighting a gap between public concerns and platform practices. It introduces the PrivacyIceberg framework categorizing risks into explicitly searched, contextually inferred, and deeply aggregated tiers, and presents IcebergExplorer, a tool that reconstructs high-fidelity personal profiles from minimal PII seeds, claiming over 90% factual accuracy within 10 minutes at under $3 cost in real-world scenarios. It identifies six root causes of disclosures and proposes multi-stakeholder countermeasures.
Significance. If the empirical claims hold under rigorous validation, the work provides a concrete, low-cost demonstration of accessible profiling risks enabled by current LLM web-access and reasoning capabilities. This could usefully inform discussions on LLM platform responsibilities, user awareness, and data publisher practices, particularly by quantifying practical attack surfaces that prior memorization-focused studies have not emphasized.
major comments (2)
- [IcebergExplorer evaluation] The headline claim of >90% factual accuracy for IcebergExplorer-reconstructed profiles (abstract and IcebergExplorer section) lacks any description of evaluation methodology, including test subject count and diversity, sources of independent ground-truth facts, controls for selection bias, or whether accuracy was measured via external verification rather than LLM self-scoring or limited author inspection. This is load-bearing for the central empirical result. (A sketch of what external-verification scoring could look like appears after this list.)
- [Empirical investigation of privacy perception] The reported dissonance between human privacy awareness and LLM platform practices (introduction and empirical investigation sections) is presented without details on survey or data-collection methods, sample sizes, or analysis approach, making it impossible to assess the strength of this supporting observation.
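To illustrate the kind of external verification the first major comment asks for, here is a minimal sketch in which accuracy is computed from independent human reviewer verdicts rather than LLM self-scoring. The `Claim` structure and scoring rule are hypothetical editorial aids, not the paper's protocol.

```python
from dataclasses import dataclass

@dataclass
class Claim:
    text: str                      # one profile fact emitted by the tool
    reviewer_verdicts: list[bool]  # one verdict per independent human reviewer

def factual_accuracy(claims: list[Claim]) -> float:
    """Count a claim as correct only when every independent reviewer
    confirms it against ground truth, so the headline accuracy figure
    cannot come from LLM self-assessment."""
    if not claims:
        return 0.0
    confirmed = sum(
        1 for c in claims if c.reviewer_verdicts and all(c.reviewer_verdicts)
    )
    return confirmed / len(claims)
```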
minor comments (2)
- [Abstract] The abstract states that six root causes are identified but does not enumerate them; including a brief list or table would improve readability and allow readers to connect them to the proposed countermeasures.
- [PrivacyIceberg framework] The three tiers of the PrivacyIceberg are introduced conceptually but would benefit from a clarifying diagram or concrete examples to distinguish 'contextually inferred' from 'deeply aggregated' risks; a toy illustration follows after this list.
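As an editorial aid to that second minor comment, here is a toy illustration of the three tiers. The tier definitions paraphrase the paper; the example facts are invented.

```python
from enum import Enum

class Tier(Enum):
    EXPLICITLY_SEARCHED = 1    # stated verbatim in a single retrieved source
    CONTEXTUALLY_INFERRED = 2  # deduced from the context of one source
    DEEPLY_AGGREGATED = 3      # emerges only by joining several sources

# Invented example facts, one per tier:
examples = {
    Tier.EXPLICITLY_SEARCHED: "employer listed on a public CV page",
    Tier.CONTEXTUALLY_INFERRED: "home city deduced from commute complaints in forum posts",
    Tier.DEEPLY_AGGREGATED: "daily routine reconstructed by joining gym check-ins, "
                            "conference talks, and code-commit timestamps",
}
```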
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which help improve the clarity and rigor of our empirical findings. We address each major comment below and have made revisions to incorporate the suggested details.
Point-by-point responses
- Referee: [IcebergExplorer evaluation] The headline claim of >90% factual accuracy for IcebergExplorer-reconstructed profiles (abstract and IcebergExplorer section) lacks any description of evaluation methodology, including test subject count and diversity, sources of independent ground-truth facts, controls for selection bias, or whether accuracy was measured via external verification rather than LLM self-scoring or limited author inspection. This is load-bearing for the central empirical result.
  Authors: We acknowledge that the original manuscript did not provide sufficient details on the evaluation methodology for the accuracy claim. This was an oversight in the presentation. In the revised version, we have expanded the IcebergExplorer section with a dedicated 'Evaluation Setup' subsection. It now includes the number of test subjects and their diversity, the sources used for independent ground-truth facts (such as public records and verified self-reports), measures taken to control for selection bias, and confirmation that accuracy was assessed through external verification by independent reviewers rather than LLM self-assessment. We believe this addresses the concern and strengthens the central result. Revision: yes.
- Referee: [Empirical investigation of privacy perception] The reported dissonance between human privacy awareness and LLM platform practices (introduction and empirical investigation sections) is presented without details on survey or data-collection methods, sample sizes, or analysis approach, making it impossible to assess the strength of this supporting observation.
  Authors: We agree that the methods for the empirical investigation of privacy perception were not described in adequate detail. The revised manuscript now includes an 'Empirical Investigation Methodology' subsection in the relevant section. This details the survey design, sample size and recruitment approach, data collection procedures, and the analysis methods used to identify the dissonance between public concerns and platform practices. These additions allow for proper evaluation of the supporting observation. Revision: yes.
Circularity Check
No significant circularity: empirical demonstration without derivation chain
Full rationale
The paper is an empirical study proposing the PrivacyIceberg categorization and IcebergExplorer tool for auditing LLM privacy risks. It reports an observed >90% factual accuracy from minimal PII seeds in real-world scenarios but contains no equations, fitted parameters, predictions, or self-citations that reduce the central claims to inputs by construction. The accuracy metric is presented as a direct experimental outcome rather than a self-referential or fitted result. No load-bearing steps rely on renaming known results, smuggling ansatzes, or uniqueness theorems from prior self-work. The work is therefore self-contained as a tool proposal and demonstration.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: LLM-integrated platforms fail to address public privacy concerns technically or through policy
- domain assumption: Minimal PII seeds enable reconstruction of high-fidelity profiles via LLM reasoning over web data
invented entities (2)
- PrivacyIceberg (no independent evidence)
- IcebergExplorer (no independent evidence)
Reference graph
Works this paper leans on
- [1] General Data Protection Regulation (GDPR), Article 4. https://gdpr-info.eu/art-4-gdpr/, 2016.
- [2] California Consumer Privacy Act (CCPA). https://oag.ca.gov/privacy/ccpa, 2018.
- [3] AIGCDaily. Netizens claim their WeChat IDs were searched by a stranger using Doubao; lawyer: possibly infringing on rights. https://www.aigcdaily.cn/news/a24qmwowb6jx7d6/, 2024.
- [4] AITNTNews. The moment I found my student number, my heart stopped. https://www.aitntnews.com/newDetail.html?newId=9702/, 2024.
- [5] Mohammad Alaa and contributors. Social Analyzer: An OSINT tool for analyzing and correlating profiles across social media platforms. https://github.com/qeeqbox/social-analyzer, 2020. Accessed: Oct. 20, 2025.
- [6] Michael Bailey, David Dittrich, Erin Kenneally, and Doug Maughan. The Menlo Report. IEEE Security & Privacy, 10(2):71–75, 2012.
- [7] Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, Alina Oprea, and Colin Raffel. Extracting training data from large language models. In USENIX Security Symposium, 2021.
- [8] Kaiyuan Chen, Yixin Ren, Yang Liu, Xiaobo Hu, Haotong Tian, Tianbao Xie, Fangfu Liu, Haoye Zhang, Hongzhang Liu, Yuan Gong, et al. xbench: Tracking agents productivity scaling with profession-aligned real-world evaluations. arXiv preprint arXiv:2506.13651, 2025.
- [9] Min Chen, Zhikun Zhang, Tianhao Wang, Michael Backes, Mathias Humbert, and Yang Zhang. Graph unlearning. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, pages 499–513, 2022.
- [10] Xiaoyi Chen, Siyuan Tang, Rui Zhu, Shijun Yan, Lei Jin, Zihao Wang, Liya Su, Zhikun Zhang, XiaoFeng Wang, and Haixu Tang. The Janus interface: How fine-tuning in large language models amplifies the privacy risks. In Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security, pages 1285–1299, 2024.
- [11] Zhaoyang Chu, Yao Wan, Zhikun Zhang, Di Wang, Zhou Yang, Hongyu Zhang, Pan Zhou, Xuanhua Shi, Hai Jin, and David Lo. Scrub it out! Erasing sensitive memorization in code language models via machine unlearning. arXiv preprint arXiv:2509.13755, 2025.
- [12] Yuntao Du, Zitao Li, Bolin Ding, Yaliang Li, Hanshen Xiao, Jingren Zhou, and Ninghui Li. Automated profile inference with language model agents. arXiv preprint arXiv:2505.12402, 2025.
- [14] Yuntao Du, Zitao Li, Ninghui Li, and Bolin Ding. Beyond data privacy: New privacy risks for large language models. arXiv preprint arXiv:2509.14278, 2025.
- [15] Fabio Duarte. Top 35 social media platforms. https://explodingtopics.com/blog/top-social-media-platforms#top-35-most-popular-social-media-websites, 2026.
- [16] Hai Huang, Zhikun Zhang, Yun Shen, Michael Backes, Qi Li, and Yang Zhang. On the privacy risks of cell-based NAS architectures. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, pages 1427–1441, 2022.
- [17] Yue Huang, Lichao Sun, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, et al. TrustLLM: Trustworthiness in large language models. arXiv preprint arXiv:2401.05561, 2024.
- [18] Hanna Kim, Minkyoo Song, Seung Ho Na, Seungwon Shin, and Kimin Lee. When LLMs go online: The emerging threat of web-enabled LLMs. In 34th USENIX Security Symposium (USENIX Security 25), pages 1729–1748, 2025.
- [19] Tadayoshi Kohno, Yasemin Acar, and Wulf Loh. Ethical frameworks and computer security trolley problems: Foundations for conversations. In 32nd USENIX Security Symposium (USENIX Security 23), pages 5145–5162, 2023.
- [20] Gary LaFever. Beyond GDPR: Unauthorized reidentification and the mosaic effect in the EU AI Act. https://iapp.org/news/a/beyond-gdpr-unauthorized-reidentification-and-the-mosaic-effect-in-the-eu-ai-act/, 2023.
- [21] Yupei Liu, Yuqi Jia, Runpeng Geng, Jinyuan Jia, and Neil Zhenqiang Gong. Formalizing and benchmarking prompt injection attacks and defenses. In 33rd USENIX Security Symposium (USENIX Security 24), pages 1831–1847, 2024.
- [22] Yupei Liu, Yuqi Jia, Jinyuan Jia, and Neil Zhenqiang Gong. Evaluating LLM-based personal information extraction and countermeasures. In USENIX Security Symposium (to appear), 2025. arXiv:2408.07291.
- [23] Matt Burgess. This prompt can make an AI chatbot identify and extract personal details from your chats. WIRED. https://www.wired.com/story/ai-imprompter-malware-llm/, 2024.
- [24] Millie Turner. Meta watch out! Modified Meta AI glasses used to 'reveal anyone's personal info' in seconds just by looking at them. The Sun. https://www.thesun.ie/tech/13939493/meta-ai-ray-ban-glasses-reveal-personal-information/, 2024.
- [25] Jiupai News. Netizens said their WeChat accounts were searched using AI, and the person in charge responded. https://news.qq.com/rain/a/20241211A052NQ00/, 2024.
- [26] Helen Nissenbaum. Privacy as contextual integrity. Washington Law Review, 79:119, 2004.
- [27] Devjeet Roy, Xuchao Zhang, Rashi Bhave, Chetan Bansal, Pedro Las-Casas, Rodrigo Fonseca, and Saravan Rajmohan. Exploring LLM-based agents for root cause analysis. In Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering, pages 208–219, 2024.
- [28] Sofia Eleni Spatharioti, David M. Rothschild, Daniel G. Goldstein, and Jake M. Hofman. Comparing traditional and LLM-based search for consumer choice: A randomized experiment. arXiv preprint arXiv:2307.03744, 2023.
- [29] Robin Staab et al. Beyond memorization: Violating privacy via inference with large language models. arXiv preprint arXiv:2310.07298, 2023.
- [30] Robin Staab, Mark Vero, Mislav Balunovic, and Martin T. Vechev. Beyond memorization: Violating privacy via inference with large language models. In The Twelfth International Conference on Learning Representations (ICLR 2024), Vienna, Austria, May 7-11, 2024. OpenReview.net, 2024.
- [31] Zhen Tan, Dawei Li, Song Wang, Alimohammad Beigi, Bohan Jiang, Amrita Bhattacharjee, Mansooreh Karami, Jundong Li, Lu Cheng, and Huan Liu. Large language models for data annotation and synthesis: A survey. arXiv preprint arXiv:2402.13446, 2024.
- [32] Batuhan Tömekçe, Mark Vero, Robin Staab, and Martin T. Vechev. Private attribute inference from images with vision-language models. In Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems (NeurIPS 2024), 2024.
- [33] Zihao Wang, Rui Zhu, Zhikun Zhang, Haixu Tang, and XiaoFeng Wang. Rigging the foundation: Manipulating pre-training for advanced membership inference attacks. In 2025 IEEE Symposium on Security and Privacy (SP), pages 2509–2526. IEEE, 2025.
- [34] Wikipedia. Mosaic effect. https://en.wikipedia.org/wiki/Mosaic_effect, 2025.
- [35] Will Knight. AI chatbots can guess your personal information from what you type. WIRED. https://www.wired.com/story/ai-chatbots-can-guess-your-personal-information/, 2023.
- [36] Wei Xu, Jue Xiao, and Jianlong Chen. Leveraging large language models to enhance personalized recommendations in e-commerce. In 2024 International Conference on Electrical, Communication and Computer Engineering (ICECCE), pages 1–6. IEEE, 2024.
- [37] Hanna Yukhymenko, Robin Staab, Mark Vero, and Martin T. Vechev. A synthetic dataset for personal attribute inference. In Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems (NeurIPS 2024), 2024.