pith. machine review for the scientific record.

arxiv: 2605.12147 · v1 · submitted 2026-05-12 · 💻 cs.CR · cs.LG

Recognition: 2 Lean theorem links

PrivacySIM: Evaluating LLM Simulation of User Privacy Behavior

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 04:51 UTC · model grok-4.3

classification 💻 cs.CR · cs.LG
keywords LLM simulation · privacy behavior · user personas · evaluation benchmark · data sharing · individual decisions · privacy attitudes

The pith

Conditioning LLMs on user personas improves simulation of individual privacy decisions yet tops out at 40.4 percent accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces PrivacySIM to test whether frontier LLMs can reproduce the privacy choices of specific people when given information about their demographics, prior experiences, and stated attitudes. It pits nine models against ground-truth answers from one thousand participants drawn from five published studies covering healthcare consultations, conversational agents, and chatbots. Adding persona details raises match rates above the no-persona baseline, but the best model still reaches only 40.4 percent agreement. Stated attitudes alone prove weak predictors because they frequently diverge from actual behavior, and users with high AI experience but low privacy concern are especially hard to simulate. The benchmark is released to support further work on making LLMs better stand-ins for individual privacy decisions.

Core claim

Conditioning nine frontier LLMs on subsets of three persona facets—demographics, previous experiences, and stated privacy attitudes—consistently raises the rate at which their responses to data-sharing scenarios match the ground-truth answers of 1,000 users from five published studies, yet the strongest model achieves only 40.4 percent accuracy and users with high AI experience but low stated privacy attitudes remain the hardest to simulate.

What carries the argument

The PrivacySIM evaluation suite, which conditions LLMs on subsets of persona facets and scores how often each model's output matches an individual user's recorded response to a privacy scenario.
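
To make that machinery concrete, here is a minimal sketch of the scoring loop the suite implies, assuming a caller-supplied query_model function and a flat per-user record; the facet names, record layout, and helpers are illustrative, not the released PrivacySIM code.

```python
# Hypothetical sketch of a PrivacySIM-style facet-subset sweep.
# `query_model(persona_prompt, question) -> str` is assumed to exist.
from itertools import combinations

FACETS = ["demographics", "experiences", "attitudes"]  # the three persona facets

def persona_prompt(user: dict, facets: tuple) -> str:
    """Render the selected persona facets into a conditioning prompt."""
    lines = [f"{f}: {user[f]}" for f in facets]
    return "User persona:\n" + "\n".join(lines) if lines else "No persona provided."

def accuracy(users: list, scenario: str, facets: tuple, query_model) -> float:
    """Fraction of users whose simulated answer matches their recorded one."""
    hits = sum(
        query_model(persona_prompt(u, facets), u["scenarios"][scenario]["question"])
        == u["scenarios"][scenario]["answer"]
        for u in users
    )
    return hits / len(users)

def sweep(users, scenario, query_model):
    """Score every facet subset, from no persona (r=0) to the full persona (r=3)."""
    return {
        subset: accuracy(users, scenario, subset, query_model)
        for r in range(len(FACETS) + 1)
        for subset in combinations(FACETS, r)
    }
```

The no-persona baseline falls out as the empty subset, so the reported persona-conditioning gains correspond to comparing each non-empty subset against the empty-tuple entry of the returned dictionary.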

Load-bearing premise

The responses collected in the five published user studies accurately represent each participant's real privacy behavior in the scenarios.

What would settle it

Collect fresh privacy decisions from the same 1,000 participants on the identical scenarios after a delay of months and measure how many original ground-truth answers no longer match the new responses.
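
A minimal sketch of that test-retest check, assuming original and re-collected answers keyed by (user_id, scenario_id); the names and toy data are illustrative.

```python
# Sketch: measure ground-truth drift between two collection waves.
def retest_drift(original: dict, retest: dict) -> float:
    """Share of ground-truth answers that changed on re-collection."""
    shared = original.keys() & retest.keys()
    changed = sum(original[k] != retest[k] for k in shared)
    return changed / len(shared)

# Toy example: one of two answers flipped between waves.
drift = retest_drift(
    {("u1", "q1"): "share", ("u1", "q2"): "refuse"},
    {("u1", "q1"): "share", ("u1", "q2"): "share"},
)
print(f"{drift:.0%} of answers drifted")  # -> 50% of answers drifted
```

A high drift rate would imply that part of the 40.4 percent ceiling reflects unstable ground truth rather than simulation failure alone.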

Figures

Figures reproduced from arXiv: 2605.12147 by James Flemings, Murali Annavaram.

Figure 1: Overview of PRIVACYSIM, an evaluation suite for simulating user privacy behavior. We collect user responses and questionnaires from existing user studies on privacy behavior in LLM and AI contexts. We then condition LLMs on subsets of a user's privacy persona (demographics, previous experiences with LLMs, and stated privacy attitudes) to simulate their responses to data-sharing questions. Finally, we evalu…
Figure 2: Average accuracy by prompt type and model across user studies.
Figure 3: Per-domain accuracy with two averages (solid black: 5-study average; dashed black: 3-study average).
Figure 4: Per-domain tolerance accuracy across all eight prompt types for every model evaluated in…
Original abstract

Large language models (LLMs) are increasingly used to simulate human behavior, but their ability to simulate individual privacy decisions is not well understood. In this paper, we address the problem of evaluating whether a core set of user persona attributes can drive LLMs to simulate individual-level privacy behavior. We introduce PrivacySIM, an evaluation suite that benchmarks LLM simulation of user privacy behavior against the ground-truth responses of 1,000 users. These users are drawn from five published user studies on privacy spanning LLM healthcare consultations, conversational agents, and chatbots. Drawing on these user studies, we hypothesize three persona facets as plausible predictors of privacy decision-making: demographics, previous experiences, and stated privacy attitudes. We condition nine frontier LLMs on subsets of these three facets and measure how often each model's response to a data-sharing scenario matches the user's actual response. Our findings show that (1) privacy persona conditioning consistently improves simulation quality over no-persona conditioning, but even the strongest model (40.4% accuracy) remains far from faithfully simulating individual privacy decisions. (2) A user's stated privacy attitudes alone may not be the best predictor because they often diverge from the user's actual privacy behavior. (3) Users with high AI/chatbot experience but low stated privacy attitudes are the most challenging to simulate. PrivacySIM is a first step toward understanding and improving the capabilities of LLMs to simulate user privacy decisions. We release PrivacySIM to enable further evaluation of LLM privacy simulation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces PrivacySIM, a benchmark suite for evaluating LLMs' ability to simulate individual-level user privacy decisions. It draws on ground-truth responses from 1,000 users across five published studies involving data-sharing scenarios in healthcare, conversational agents, and chatbots. The authors posit three persona facets (demographics, prior experiences, and stated privacy attitudes) as drivers of decisions, condition nine frontier LLMs on subsets of these facets, and measure how often model outputs match the users' reported choices. Results indicate that persona conditioning improves simulation accuracy over baselines, with the best model reaching 40.4% accuracy, while also noting that attitudes alone are weak predictors and that high-AI-experience/low-attitude users are hardest to simulate. The benchmark is released publicly.

Significance. If the empirical results hold under scrutiny, the work provides a concrete, reusable benchmark for assessing LLM simulation of privacy behavior, an increasingly relevant capability for user modeling and privacy research. The release of PrivacySIM and the identification of specific conditioning effects and failure modes are strengths that enable follow-on work. The paper is measured in its claims, avoiding overstatement of current LLM fidelity.

major comments (3)
  1. [§3] §3 (User Studies and Ground Truth): The central evaluation equates matching LLM outputs to survey responses on hypothetical vignettes with 'simulation of individual privacy behavior.' However, the source studies collect stated intentions rather than observed decisions; the manuscript should explicitly discuss the privacy paradox and social-desirability bias as threats to interpreting the 40.4% ceiling as evidence of (or distance from) faithful simulation of underlying decision processes.
  2. [§4.3] §4.3 (Evaluation Metrics): The exact procedure for determining a 'match' between an LLM response and a user's ground-truth answer is not fully specified (e.g., exact string match, semantic similarity threshold, or LLM judge). This definition is load-bearing for all reported accuracy figures and the claim that persona conditioning 'consistently improves' quality. One way the rule could be pinned down is sketched after the minor comments below.
  3. [§5] §5 (Results and Analysis): No statistical tests (p-values, confidence intervals, or effect sizes) are reported for the accuracy differences between conditioning regimes. Without these, it is difficult to assess whether the observed improvements over no-persona baselines are reliable or could be explained by variance across the 1,000 users or five studies.
minor comments (2)
  1. [Abstract and §1] The abstract and §1 state that 'privacy persona conditioning consistently improves simulation quality,' but the precise subsets of facets tested (e.g., demographics only vs. all three) and their per-model breakdowns should be tabulated for clarity.
  2. [Figures] Figure captions and legends could more explicitly label the conditioning conditions (none, demographics, experiences, attitudes, full persona) to aid quick reading of the accuracy plots.
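
On major comment 2, a sketch of one plausible way the match rule could be specified: normalized exact match for categorical answers plus a plus-or-minus band for Likert items, which would also give the 'tolerance accuracy' of Figure 4 a concrete reading. The normalization and tolerance rule here are assumptions about a possible revision, not the authors' documented procedure.

```python
import re

def normalize(text: str) -> str:
    """Lowercase and strip punctuation so 'Agree.' matches 'agree'."""
    return re.sub(r"[^a-z0-9 ]", "", text.strip().lower())

def is_match(model_answer: str, ground_truth: str, tolerance: int = 0) -> bool:
    """Exact match after normalization; optional +/- band for Likert integers."""
    a, b = normalize(model_answer), normalize(ground_truth)
    if a == b:
        return True
    if a.isdigit() and b.isdigit():  # Likert-scale responses, e.g. "4" vs "5"
        return abs(int(a) - int(b)) <= tolerance
    return False

assert is_match("Agree.", "agree")
assert is_match("4", "5", tolerance=1)  # would count under a tolerance metric
assert not is_match("share", "refuse")
```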

Simulated Author's Rebuttal

3 responses · 0 unresolved

We are grateful to the referee for their constructive comments, which have helped us identify areas for improvement in the manuscript. We have addressed each major comment by planning revisions to enhance the discussion of limitations, clarify the evaluation methodology, and add statistical analysis. These changes will strengthen the paper's contributions and transparency.

Point-by-point responses
  1. Referee: [§3] §3 (User Studies and Ground Truth): The central evaluation equates matching LLM outputs to survey responses on hypothetical vignettes with 'simulation of individual privacy behavior.' However, the source studies collect stated intentions rather than observed decisions; the manuscript should explicitly discuss the privacy paradox and social-desirability bias as threats to interpreting the 40.4% ceiling as evidence of (or distance from) faithful simulation of underlying decision processes.

    Authors: We thank the referee for highlighting this important distinction. The ground-truth data indeed consists of stated responses to hypothetical scenarios from published user studies, which are subject to the privacy paradox (where stated attitudes differ from actual behavior) and potential social-desirability bias. We will revise §3 to explicitly discuss these limitations and their implications for interpreting the simulation accuracy results. This will clarify that the benchmark evaluates alignment with stated intentions rather than unobserved real-world decisions. revision: yes

  2. Referee: [§4.3] §4.3 (Evaluation Metrics): The exact procedure for determining a 'match' between an LLM response and a user's ground-truth answer is not fully specified (e.g., exact string match, semantic similarity threshold, or LLM judge). This definition is load-bearing for all reported accuracy figures and the claim that persona conditioning 'consistently improves' quality.

    Authors: We agree that the matching procedure requires more precise specification. In the revised manuscript, we will expand §4.3 to detail the exact method used for determining matches, including any parsing of responses, similarity metrics if applicable, or use of automated judges. This will ensure reproducibility and allow readers to assess the robustness of the accuracy figures. revision: yes

  3. Referee: [§5] §5 (Results and Analysis): No statistical tests (p-values, confidence intervals, or effect sizes) are reported for the accuracy differences between conditioning regimes. Without these, it is difficult to assess whether the observed improvements over no-persona baselines are reliable or could be explained by variance across the 1,000 users or five studies.

    Authors: We appreciate this feedback on the statistical rigor of our analysis. We will incorporate appropriate statistical tests in §5, such as paired t-tests or Wilcoxon signed-rank tests across users for the accuracy differences, along with confidence intervals and effect sizes. This will provide evidence for the reliability of the improvements observed with persona conditioning; a sketch of such a paired test follows these responses. revision: yes
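
A sketch of the per-user paired test the rebuttal proposes, run here on synthetic accuracies: a Wilcoxon signed-rank test over each user's accuracy with and without persona conditioning, plus a paired effect size. The accuracy distributions are invented for illustration; only the procedure is the point.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
n_users = 1000
# Synthetic per-user accuracies over 10 scenarios each.
acc_no_persona = rng.binomial(10, 0.30, n_users) / 10
acc_full_persona = rng.binomial(10, 0.38, n_users) / 10

stat, p = wilcoxon(acc_full_persona, acc_no_persona)  # paired; zero diffs dropped
diff = acc_full_persona - acc_no_persona
effect = diff.mean() / diff.std(ddof=1)  # paired Cohen's d
print(f"Wilcoxon p={p:.2e}, mean gain={diff.mean():.3f}, d={effect:.2f}")
```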

Circularity Check

0 steps flagged

No circularity: direct empirical comparison to external ground-truth data

Full rationale

The paper evaluates LLM simulation quality via direct match rates between model outputs and user responses drawn from five independent published studies (1,000 users total). No equations, fitted parameters, or derivations are present that reduce the reported accuracy (e.g., 40.4%) to a self-referential definition or input by construction. Persona facets serve as experimental conditioning variables whose effects are measured against the external benchmark, keeping the chain self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The evaluation rests on the domain assumption that the selected persona facets drive privacy decisions and that the published studies supply valid ground truth; no free parameters or invented entities are introduced.

axioms (1)
  • domain assumption: Demographics, previous experiences, and stated privacy attitudes are plausible predictors of individual privacy decision-making.
    Explicitly hypothesized in the abstract as the basis for conditioning the LLMs.

pith-pipeline@v0.9.0 · 5560 in / 1213 out tokens · 70654 ms · 2026-05-13T04:51:36.754543+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

61 extracted references · 61 canonical work pages · 3 internal anchors

  1. [1] Noura Abdi, Xiao Zhan, Kopo M. Ramokapane, and Jose Such. Privacy norms for smart home personal assistants. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, pages 1–14, 2021.

  2. [2] Alessandro Acquisti. Privacy in electronic commerce and the economics of immediate gratification. In Proceedings of the 5th ACM Conference on Electronic Commerce, pages 21–29, 2004.

  3. [3] Alessandro Acquisti and Jens Grossklags. Privacy and rationality in individual decision making. IEEE Security & Privacy, 3(1):26–33, 2005.

  4. [4] Alessandro Acquisti, Allan Friedman, and Rahul Telang. Is there a cost to privacy breaches? An event study. ICIS 2006 Proceedings, page 94, 2006.

  5. [5] Gati V. Aher, Rosa I. Arriaga, and Adam Tauman Kalai. Using large language models to simulate multiple humans and replicate human subject studies. In International Conference on Machine Learning, pages 337–371. PMLR, 2023.

  6. [6] Noah Apthorpe, Yan Shvartzshnaider, Arunesh Mathur, Dillon Reisman, and Nick Feamster. Discovering smart home internet of things privacy norms using contextual integrity. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 2(2):1–23, 2018.

  7. [7] Lisa P. Argyle, Ethan C. Busby, Nancy Fulda, Joshua R. Gubler, Christopher Rytting, and David Wingate. Out of one, many: Using language models to simulate human samples. Political Analysis, 31(3):337–351, 2023.

  8. [8] James Bisbee, Joshua D. Clinton, Cassy Dorff, Brenton Kenkel, and Jennifer M. Larson. Synthetic replacements for human survey data? The perils of large language models. Political Analysis, 32(4):401–416, 2024. doi: 10.1017/pan.2024.5.

  9. [9] Hanbyul Choi, Jonghwa Park, and Yoonhyuk Jung. The role of privacy fatigue in online privacy behavior. Computers in Human Behavior, 81:42–51, 2018.

  10. [10] Kai-Hsiang Chou, Yi-An Wang, Chong Kai Lau, Mahmood Sharif, and Hsu-Chun Hsiao. Bot among us: Exploring user awareness and privacy concerns about chatbots in group chats. Proceedings on Privacy Enhancing Technologies, 2026.

  11. [11] Mary J. Culnan and Pamela K. Armstrong. Information privacy concerns, procedural fairness, and impersonal trust: An empirical investigation. Organization Science, 10(1):104–115, 1999.

  12. [12] Ricardo Dominguez-Olmedo, Moritz Hardt, and Celestine Mendler-Dünner. Questioning the survey responses of large language models. In Advances in Neural Information Processing Systems (NeurIPS), 2024.

  13. [13] Janna Lynn Dupree, Richard Devries, Daniel M. Berry, and Edward Lank. Privacy personas: Clustering users via attitudes and behaviors toward security practices. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pages 5228–5239, 2016.

  14. [14] Kassem Fawaz, Ren Yi, Octavian Suciu, Rishabh Khandelwal, Hamza Harkous, Nina Taft, and Marco Gruteser. Text-based personas for simulating user privacy decisions. arXiv preprint arXiv:2603.19791, 2026.

  15. [15] Adrienne Porter Felt, Elizabeth Ha, Serge Egelman, Ariel Haney, Erika Chin, and David Wagner. Android permissions: User attention, comprehension, and behavior. In Proceedings of the Eighth Symposium on Usable Privacy and Security, pages 1–14, 2012.

  16. [16] James Flemings, Ren Yi, Octavian Suciu, Kassem Fawaz, Murali Annavaram, and Marco Gruteser. Personalizing agent privacy decisions via logical entailment. arXiv preprint arXiv:2512.05065, 2025.

  17. [17] Tiancheng Hu, Joachim Baumann, Lorenzo Lupo, Nigel Collier, Dirk Hovy, and Paul Röttger. SimBench: Benchmarking the ability of large language models to simulate human behaviors. arXiv preprint arXiv:2510.17516, 2025.

  18. [18] EunJeong Hwang, Bodhisattwa Majumder, and Niket Tandon. Aligning language models to user opinions. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 5906–5919, 2023.

  19. [19] Brihi Joshi, Xiang Ren, Swabha Swayamdipta, Rik Koncel-Kedziorski, and Tim Paek. Improving language model personas via rationalization with psychological scaffolds. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, Suzhou, China. Association for Computational Linguistics, 2025.

  20. [20] Spyros Kokolakis. Privacy attitudes and privacy behaviour: A review of current research on the privacy paradox phenomenon. Computers & Security, 64:122–134, 2017.

  21. [21] Ponnurangam Kumaraguru and Lorrie Faith Cranor. Privacy indexes: A survey of Westin's studies. 2005.

  22. [22] Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph E. Gonzalez, Hao Zhang, and Ion Stoica. Efficient memory management for large language model serving with PagedAttention. In Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles, 2023.

  23. [23] Yuxuan Li, Leyang Li, Hao-Ping Lee, and Sauvik Das. How well can LLM agents simulate end-user security and privacy attitudes and behaviors? arXiv preprint arXiv:2602.18464, 2026.

  24. [24] Zhihuang Liu, Ling Hu, Tongqing Zhou, Yonghao Tang, and Zhiping Cai. Prevalence overshadows concerns? Understanding Chinese users' privacy awareness and expectations towards LLM-based healthcare consultation. In 2025 IEEE Symposium on Security and Privacy (SP), pages 2716–2734. IEEE, 2025.

  25. [25] Lisa Mekioussa Malki, Akhil Polamarasetty, Majid Hatamian, Mark Warner, and Enrico Costanza. Hoovered up as a data point: Exploring privacy behaviours, awareness, and concerns among UK users of LLM-based conversational agents. Proceedings on Privacy Enhancing Technologies, 2025.

  26. [26] Kirsten Martin and Helen Nissenbaum. Measuring privacy: An empirical test using context to expose confounding variables. Columbia Science & Technology Law Review, 18:176, 2016.

  27. [27] Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou, Yulia Tsvetkov, Maarten Sap, Reza Shokri, and Yejin Choi. Can LLMs keep a secret? Testing privacy implications of language models via contextual integrity theory. In The Twelfth International Conference on Learning Representations, 2024.

  28. [28] Helen Nissenbaum. Privacy as contextual integrity. Washington Law Review, 79:119, 2004.

  29. [29] Patricia A. Norberg, Daniel R. Horne, and David A. Horne. The privacy paradox: Personal information disclosure intentions versus behaviors. Journal of Consumer Affairs, 41(1):100–126, 2007.

  30. [30] NVIDIA. NVIDIA Nemotron 3: Efficient and open intelligence, 2025. URL https://arxiv.org/abs/2512.20856. White paper.

  31. [31] Joon Sung Park, Joseph O'Brien, Carrie Jun Cai, Meredith Ringel Morris, Percy Liang, and Michael S. Bernstein. Generative agents: Interactive simulacra of human behavior. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, pages 1–22, 2023.

  32. [32] Joon Sung Park, Carolyn Q. Zou, Aaron Shaw, Benjamin Mako Hill, Carrie Cai, Meredith Ringel Morris, Robb Willer, Percy Liang, and Michael S. Bernstein. Generative agent simulations of 1,000 people. arXiv preprint arXiv:2411.10109, 2024.

  33. [33] Qwen Team. Qwen3.5: Towards native multimodal agents, February 2026. URL https://qwen.ai/blog?id=qwen3.5.

  34. [34] Ronald W. Rogers. A protection motivation theory of fear appeals and attitude change. The Journal of Psychology, 91(1):93–114, 1975.

  35. [35] Shibani Santurkar, Esin Durmus, Faisal Ladhak, Cinoo Lee, Percy Liang, and Tatsunori Hashimoto. Whose opinions do language models reflect? In Proceedings of the 40th International Conference on Machine Learning (ICML), 2023.

  36. [36] Yijia Shao, Tianshi Li, Weiyan Shi, Yanchen Liu, and Diyi Yang. PrivacyLens: Evaluating privacy norm awareness of language models in action. Advances in Neural Information Processing Systems, 37:89373–89407, 2024.

  37. [37] Sarah Tran, Hongfan Lu, Isaac Slaughter, Bernease Herman, Aayushi Dangol, Yue Fu, Lufei Chen, Biniyam Gebreyohannes, Bill Howe, Alexis Hiniker, et al. Understanding privacy norms around LLM-based chatbots: A contextual integrity perspective. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, volume 8, pages 2522–2534, 2025.

  38. [38] Sarah Tran, Robert Wolfe, and Nicholas Weber. Replication data for: Understanding privacy norms around LLM-based chatbots: A contextual integrity perspective, 2025. URL https://doi.org/10.7910/DVN/M6ABJ3.

  39. [39] Pranav Narayanan Venkit, Yu Li, Yada Pruksachatkun, and Chien-Sheng Wu. The need for a socially-grounded persona framework for user simulation. arXiv preprint arXiv:2601.07110, 2026.

  40. [40] Yang Wang, Gregory Norcie, Saranga Komanduri, Alessandro Acquisti, Pedro Giovanni Leon, and Lorrie Faith Cranor. "I regretted the minute I pressed share": A qualitative study of regrets on Facebook. In Proceedings of the Seventh Symposium on Usable Privacy and Security, pages 1–16, 2011.

  41. [41] Primal Wijesekera, Arjun Baokar, Lynn Tsai, Joel Reardon, Serge Egelman, David Wagner, and Konstantin Beznosov. The feasibility of dynamically granted permissions: Aligning mobile privacy with user preferences. In 2017 IEEE Symposium on Security and Privacy (SP), pages 1077–1093. IEEE, 2017.

  42. [42] Yuhao Wu, Ke Yang, Franziska Roesner, Tadayoshi Kohno, Ning Zhang, and Umar Iqbal. Towards automating data access permissions in AI agents. arXiv preprint arXiv:2511.17959, 2025.

  43. [43] Zhiping Zhang, Michelle Jia, Hao-Ping Lee, Bingsheng Yao, Sauvik Das, Ada Lerner, Dakuo Wang, and Tianshi Li. "It's a fair game", or is it? Examining how users navigate disclosure risks and benefits when using LLM-based conversational agents. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, pages 1–26, 2024.

  44. [44] Noé Zufferey, Sarah Abdelwahab Gaballah, Karola Marky, and Verena Zimmermann. "AI is from the devil." Behaviors and concerns toward personal data sharing with LLM-based conversational agents. Proceedings on Privacy Enhancing Technologies, 2025(3):5–28, 2025.
