Can LLMs Infer Conversational Agent Users' Personality Traits from Chat History?

Derya Cögendez; Noé Zufferey; Verena Zimmermann

arXiv: 2604.19785 · v2 · submitted 2026-03-31 · 💻 cs.CL · cs.AI · cs.CR · cs.CY

Pith reviewed 2026-05-13 23:01 UTC · model grok-4.3

keywords: personality inference · privacy risks · conversational agents · chat history · large language models · trait classification · RoBERTa · ChatGPT logs

The pith

Fine-tuned models infer personality traits like extraversion from ChatGPT chat histories at rates above random guessing.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper collects 62,090 real chat logs from 668 users and trains RoBERTa-base classifiers to predict Big Five personality traits from those interactions. In several cases the models exceed random baselines on ternary classification, with the largest gain shown for extraversion on logs involving relationships and personal reflection. A reader would care because everyday conversations with AI agents may leak stable personal attributes without any explicit query, creating avenues for influence or misuse of derived information.

Core claim

We collected actual ChatGPT logs from N=668 participants containing 62,090 individual chats and fine-tuned RoBERTa-base text classification models to infer personality traits from CA interactions. The findings show that these models achieve trait inference with accuracy (ternary classification) better than random in multiple cases. For example, for extraversion, accuracy improves by +44% relative to the baseline on interactions for relationships and personal reflection.
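
As a worked reading of the +44% figure: a relative improvement compares the gain in accuracy to the baseline itself, not in percentage points. The sketch below assumes, purely for illustration, a uniform-chance baseline of 1/3 for three classes; the abstract does not state the baseline value, and the function names are hypothetical.

```python
def relative_improvement(accuracy: float, baseline: float) -> float:
    """Gain over the baseline, expressed as a fraction of the baseline."""
    return (accuracy - baseline) / baseline


def implied_accuracy(baseline: float, relative_gain: float) -> float:
    """Accuracy that corresponds to a given relative gain over the baseline."""
    return baseline * (1.0 + relative_gain)


# Assumed baseline: uniform guessing over three classes (not taken from the paper).
assumed_baseline = 1 / 3
accuracy = implied_accuracy(assumed_baseline, 0.44)
print(f"implied accuracy: {accuracy:.2f}")
```

Under that assumption, a +44% relative gain corresponds to roughly 48% accuracy, about 15 percentage points above chance. A different baseline, such as a majority-class proportion, would give a different absolute figure.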

What carries the argument

RoBERTa-base text classification models fine-tuned on conversational chat logs to predict personality traits via ternary classification.

If this is right

Interactions about relationships and personal reflection carry higher risk for inferring extraversion than other use cases.
Privacy risks vary systematically by conversation type rather than being uniform across all chats.
Derived personality information can be obtained without users stating their traits directly.
The same logs that support normal use also enable trait inference at measurable rates above chance.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Users may want to treat all exchanges with conversational agents as potentially revealing rather than private.
Similar inference pipelines could be tested on other attributes such as emotional state or decision biases.
Service providers could add on-device filtering or noise injection to reduce unintended trait leakage.
Policy discussions around AI privacy should include derived personal data obtained from interaction patterns.

Load-bearing premise

The collected chat logs contain sufficient and unbiased signals of personality traits, and the participants' self-reported personality labels are accurate and stable enough to serve as ground truth.

What would settle it

A replication that pairs the same chat logs with personality labels obtained independently (for example through observer ratings rather than self-report) and finds that model accuracy drops to random levels.

Figures

Figures reproduced from arXiv: 2604.19785 by Derya Cögendez, Noé Zufferey, Verena Zimmermann.

**Figure 2.** Number of times participants interacted with ChatGPT for specific use cases (N=668).

Original abstract

Sensitive information, such as knowledge about an individual's personality, can be can be misused to influence behavior (e.g., via personalized messaging). To assess to what extent an individual's personality can be inferred from user interactions with LLM-based conversational agents (CAs), we analyze and quantify related privacy risks of using CAs. We collected actual ChatGPT logs from N=668 participants, containing 62,090 individual chats, and report statistics about the different types of shared data and use cases. We fine-tuned RoBERTa-base text classification models to infer personality traits from CA interactions. The findings show that these models achieve trait inference with accuracy (ternary classification) better than random in multiple cases. For example, for extraversion, accuracy improves by +44% relative to the baseline on interactions for relationships and personal reflection. This research highlights how interactions with CAs pose privacy risks and provides fine-grained insights into the level of risk associated with different types of interactions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Real ChatGPT logs let fine-tuned RoBERTa beat random on some trait inferences, but the abstract leaves out every detail needed to judge the numbers.

The paper shows that fine-tuned RoBERTa models can classify personality traits from actual user chats with ChatGPT at rates above random guessing, with the largest reported lift for extraversion on relationship and reflection topics. That is the central empirical claim.

The work stands out for using real logs from 668 participants and 62,090 chats rather than synthetic examples, and for breaking results down by interaction category. This gives a concrete picture of where privacy leakage might be higher in everyday use. The dataset description and topic categorization are straightforward and useful for anyone tracking data exposure in conversational agents. The approach sticks to standard text classification, which makes the setup easy to understand and potentially replicable.

The soft spots are the missing pieces that matter most for the result. The abstract supplies no information on how personality was measured, what instrument was used, how participants were recruited, what the train-test splits were, or whether any statistical tests were run. Self-reported labels can be noisy or unstable, so the reported accuracy gains could partly reflect label artifacts instead of real conversational signals. Without those details the +44% relative improvement cannot be evaluated properly.

This paper is aimed at researchers working on AI privacy risks and data protection. Someone studying inference attacks or regulatory questions would find the topic-specific numbers worth looking at as an initial measurement. It deserves a serious referee to check the full methods, data handling, and statistical reporting before the claims can be taken as solid.

Referee Report

2 major / 1 minor

Summary. The paper claims that LLMs can infer users' personality traits from interactions with conversational agents. Using real ChatGPT logs collected from N=668 participants (62,090 chats), the authors fine-tune RoBERTa-base models for ternary classification of personality traits and report accuracies better than random baselines, including a +44% relative improvement for extraversion on interactions involving relationships and personal reflection. The work quantifies privacy risks associated with different types of CA usage.

Significance. If the empirical results hold after methodological clarification, the paper would offer concrete evidence of privacy vulnerabilities in LLM-based conversational agents, with fine-grained analysis of risk levels across interaction categories. The use of actual user logs rather than synthetic data is a positive aspect that could strengthen the contribution to privacy research in NLP.

major comments (2)

[Abstract] Abstract: The central claim of better-than-random ternary classification accuracy (e.g., +44% relative gain for extraversion) rests on the quality of the personality labels as ground truth. The abstract supplies no information on the assessment instrument, recruitment procedure, label reliability (test-retest or behavioral validation), or potential confounds, making it impossible to determine whether reported lifts reflect conversational signals or label artifacts.
[Abstract] Abstract: No description is given of the train/test split strategy, baseline construction, or statistical tests confirming that accuracies exceed chance. These omissions are load-bearing because the headline result cannot be verified or replicated without them.
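
One conventional way to check whether an accuracy exceeds chance is an exact one-sided binomial test. The following is a minimal sketch, not the authors' procedure: the counts are invented for illustration, `binomial_tail` is a hypothetical helper, and the test assumes independent predictions, which may not hold when several chats come from the same user.

```python
from math import comb


def binomial_tail(successes: int, trials: int, chance: float) -> float:
    """Exact P(X >= successes) for X ~ Binomial(trials, chance)."""
    return sum(
        comb(trials, k) * chance**k * (1.0 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Invented example: 60 correct out of 134 predictions, chance level 1/3.
p_value = binomial_tail(successes=60, trials=134, chance=1 / 3)
print(f"one-sided p-value: {p_value:.4f}")
```

A small p-value here would only say that the observed hit count is unlikely under the stated chance level; it says nothing about whether the chance level itself (uniform versus majority-class) was chosen appropriately.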

minor comments (1)

[Abstract] Abstract: Typographical error in the sentence 'Sensitive information, such as knowledge about an individual's personality, can be can be misused'.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the careful reading and constructive comments. We have revised the abstract to supply the requested methodological details on label construction, data partitioning, and evaluation. Point-by-point responses follow.

Referee: [Abstract] Abstract: The central claim of better-than-random ternary classification accuracy (e.g., +44% relative gain for extraversion) rests on the quality of the personality labels as ground truth. The abstract supplies no information on the assessment instrument, recruitment procedure, label reliability (test-retest or behavioral validation), or potential confounds, making it impossible to determine whether reported lifts reflect conversational signals or label artifacts.

Authors: We agree the abstract was too terse. The revised abstract now states that traits were measured with the 44-item Big Five Inventory (BFI) via self-report, that participants were recruited through online platforms with informed consent, and that labels were obtained by tertile splits. The full manuscript already describes topic-level controls and reports consistent gains across multiple categories, which would be unlikely under pure label noise. We note that no additional test-retest or behavioral validation was collected for this sample. revision: yes
Referee: [Abstract] Abstract: No description is given of the train/test split strategy, baseline construction, or statistical tests confirming that accuracies exceed chance. These omissions are load-bearing because the headline result cannot be verified or replicated without them.

Authors: We accept the criticism. The revised abstract now indicates that an 80/20 user-stratified split was used to avoid leakage, that the random baseline reflects the majority-class proportion, and that improvements were assessed with statistical tests against chance. These details were already present in the methods and results sections; the abstract has been expanded to make them visible at a glance. revision: yes
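
The choices named in these responses (tertile labels, a user-level split, a majority-class baseline) can be sketched as follows. This is an illustration under assumptions, not the authors' code: the user IDs and scores are made up, all function names are hypothetical, and "user-stratified" is read here as splitting whole users while preserving label proportions.

```python
import random
from collections import Counter


def tertile_labels(scores: dict) -> dict:
    """Label each user 'low', 'mid' or 'high' by rank tertile of the trait score."""
    ranked = sorted(scores, key=scores.get)
    n = len(ranked)
    names = ("low", "mid", "high")
    return {user: names[3 * rank // n] for rank, user in enumerate(ranked)}


def user_level_split(labels: dict, test_fraction: float = 0.2, seed: int = 0):
    """Split users (not chats) into train/test, stratified by label."""
    rng = random.Random(seed)
    by_label = {}
    for user, label in labels.items():
        by_label.setdefault(label, []).append(user)
    train, test = set(), set()
    for users in by_label.values():
        users = sorted(users)
        rng.shuffle(users)
        cut = round(len(users) * test_fraction)
        test.update(users[:cut])
        train.update(users[cut:])
    return train, test


def majority_baseline(labels: dict, users: set) -> float:
    """Accuracy of always predicting the most frequent label among `users`."""
    counts = Counter(labels[u] for u in users)
    return max(counts.values()) / len(users)


# Made-up scores for 15 hypothetical users.
scores = {f"u{i:02d}": float(i) for i in range(15)}
labels = tertile_labels(scores)
train_users, test_users = user_level_split(labels)
baseline = majority_baseline(labels, test_users)
```

Assigning every chat to the side its author falls on keeps one user's chats out of both train and test, which is the leakage the split is meant to avoid.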

standing simulated objections not resolved

Test-retest reliability or behavioral validation of the personality labels beyond standard BFI self-report administration

Circularity Check

0 steps flagged

No circularity: empirical supervised inference on collected logs and external labels

The paper's core result is obtained by collecting 62,090 real ChatGPT logs from 668 participants, obtaining separate personality trait labels (presumably via standard self-report instruments), fine-tuning RoBERTa-base classifiers on the chat text, and measuring ternary classification accuracy against held-out data. Reported lifts such as +44% relative improvement for extraversion are direct performance metrics on this pipeline. No step redefines the target labels in terms of the model's outputs, renames a fitted parameter as a prediction, or relies on a self-citation chain to establish uniqueness or an ansatz. The derivation is therefore self-contained and externally falsifiable.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on standard NLP transfer learning assumptions and the empirical performance of fine-tuned models on the collected dataset; no explicit free parameters, invented entities, or non-standard axioms are stated.

axioms (1)

standard math: RoBERTa-base can be fine-tuned for ternary text classification of personality traits from chat logs
Standard assumption in NLP transfer learning literature.

pith-pipeline@v0.9.0 · 5478 in / 1187 out tokens · 53613 ms · 2026-05-13T23:01:32.956448+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

We fine-tuned RoBERTa-base text classification models to infer personality traits from CA interactions... accuracy (ternary classification) better than random

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear

We collected actual ChatGPT logs from N=668 participants, containing 62,090 individual chats

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Inferential Privacy Leakage in Anonymized Conversational AI Logs
cs.CY 2026-05 unverdicted novelty 6.0

LLM-based inference recovers user age, gender, and country from filtered ChatGPT logs at weighted F1 scores of 0.84-0.90, with median identification from the first 5% of history, driven by stereotype patterns.

Reference graph

Works this paper leans on

39 extracted references · 39 canonical work pages · cited by 1 Pith paper · 1 internal anchor

[1]

Personality and Individ- ual Differences87, 147–152 (Dec 2015).https://doi.org/10.1016/j.paid.2015.07.037,https: //www.sciencedirect.com/science/article/pii/S0191886915004948

Alkı¸ s, N., Ta¸ skaya Temizel, T.: The impact of individual differences on influence strategies. Personality and Individ- ual Differences87, 147–152 (Dec 2015).https://doi.org/10.1016/j.paid.2015.07.037,https: //www.sciencedirect.com/science/article/pii/S0191886915004948

work page doi:10.1016/j.paid.2015.07.037 2015
[2]

In: Proc

Arp, D., Quiring, E., Pendlebury, F., Warnecke, A., Pierazzi, F., Wressnegger, C., Cavallaro, L., Rieck, K.: Dos and Don’ts of Machine Learning in Computer Security. In: Proc. of the USENIX Security Symp. USENIX As- sociation (2020).https://doi.org/https://doi.org/10.48550/arXiv.2010.09470, zotAbbrevi- ate:no proceedingsTitle

work page doi:10.48550/arxiv.2010.09470 2020
[3]

How People Use ChatGPT

Chatterji, A., Cunningham, T., Deming, D.J., Hitzig, Z., Ong, C., Shan, C.Y ., Wadman, K.: How people use chatgpt. Working Paper 34255, National Bureau of Economic Research (2025).https://doi.org/10.3386/w34255, http://www.nber.org/papers/w34255

work page doi:10.3386/w34255 2025
[4]

Computers in Hu- man Behavior (2015).https://doi.org/10.1016/j.chb.2014.12.038,https://linkinghub

Chorley, M.J., Whitaker, R.M., Allen, S.M.: Personality and location-based social networks. Computers in Hu- man Behavior (2015).https://doi.org/10.1016/j.chb.2014.12.038,https://linkinghub. elsevier.com/retrieve/pii/S0747563214007559

work page doi:10.1016/j.chb.2014.12.038 2015
[5]

Economics Letters (1) (2012)

Cobb-Clark, D.A., Schurer, S.: The stability of big-five personality traits. Economics Letters (1) (2012). https://doi.org/10.1016/j.econlet.2011.11.015,https://linkinghub.elsevier. com/retrieve/pii/S0165176511004666

work page doi:10.1016/j.econlet.2011.11.015 2012
[6]

koi.ai/blog/urban-vpn-browser-extension-ai-conversations-data-collection

Dardikman, I.: 8 Million Users’ AI Conversations Sold for Profit by "Privacy" Extensions (2025),https://www. koi.ai/blog/urban-vpn-browser-extension-ai-conversations-data-collection

work page 2025
[7]

Duarte, F.: Number of ChatGPT Users (January 2026) (2026),https://explodingtopics.com/blog/ chatgpt-users

work page 2026
[8]

Cögendez et al

Duhigg, C.: What Does Your Credit-Card Company Know About You? The New York Times (May 2009),https: //www.nytimes.com/2009/05/17/magazine/17credit-t.html 14 D. Cögendez et al

work page 2009
[9]

Egyptian Informatics Journal (1) (2022).https://doi.org/10.1016/j.eij.2021

El-Demerdash, K., El-Khoribi, R.A., Ismail Shoman, M.A., Abdou, S.: Deep learning based fusion strategies for personality prediction. Egyptian Informatics Journal (1) (2022).https://doi.org/10.1016/j.eij.2021. 05.004,https://linkinghub.elsevier.com/retrieve/pii/S1110866521000311

work page doi:10.1016/j.eij.2021 2022
[10]

Nature (2018), https://www.nature.com/articles/d41586-018-03880-4

Gibney, E.: The scant science behind Cambridge Analytica’s controversial marketing techniques. Nature (2018), https://www.nature.com/articles/d41586-018-03880-4

work page 2018
[11]

Journal of Research in Personality (6) (2003).https://doi.org/10.1016/S0092-6566(03)00046-1,https: //linkinghub.elsevier.com/retrieve/pii/S0092656603000461

Gosling, S.D., Rentfrow, P.J., Swann, W.B.: A very brief measure of the Big-Five personality domains. Journal of Research in Personality (6) (2003).https://doi.org/10.1016/S0092-6566(03)00046-1,https: //linkinghub.elsevier.com/retrieve/pii/S0092656603000461

work page doi:10.1016/s0092-6566(03)00046-1 2003
[12]

In: 2021 Tenth International Conference on Intelligent Computing and Information Sys- tems (ICICIS)

Hassanein, M., Rady, S., Hussein, W., Gharib, T.F.: Predicting the Big Five for social network users using their personality characteristics. In: 2021 Tenth International Conference on Intelligent Computing and Information Sys- tems (ICICIS). pp. 160–164 (Dec 2021).https://doi.org/10.1109/ICICIS52592.2021.9694160, https://ieeexplore.ieee.org/document/9694160

work page doi:10.1109/icicis52592.2021.9694160 2021
[13]

Psychological Science (2012).https:// doi.org/10.1177/0956797611436349,https://sci-hub.st/10.1177/0956797611436349

Hirsh, J.B., Kang, S.K., Bodenhausen, G.V .: Personalized Persuasion. Psychological Science (2012).https:// doi.org/10.1177/0956797611436349,https://sci-hub.st/10.1177/0956797611436349

work page doi:10.1177/0956797611436349 2012
[14]

Machine Learning (3) (2014).https://doi.org/10.1007/ s10994-013-5415-y,http://link.springer.com/10.1007/s10994-013-5415-y

Kosinski, M., Bachrach, Y ., Kohli, P., Stillwell, D., Graepel, T.: Manifestations of user personality in website choice and behaviour on online social networks. Machine Learning (3) (2014).https://doi.org/10.1007/ s10994-013-5415-y,http://link.springer.com/10.1007/s10994-013-5415-y

work page doi:10.1007/s10994-013-5415-y 2014
[15]

Neural Networks (2014).https://doi.org/10.1016/j.neunet.2014.05.020, https://linkinghub.elsevier.com/retrieve/pii/S0893608014001282

Lima, A.C.E., de Castro, L.N.: A multi-label, semi-supervised classification approach applied to personality pre- diction in social media. Neural Networks (2014).https://doi.org/10.1016/j.neunet.2014.05.020, https://linkinghub.elsevier.com/retrieve/pii/S0893608014001282

work page doi:10.1016/j.neunet.2014.05.020 2014
[16]

Hoovered up as a data point

Malki, L.M., Polamarasetty, A., Hatamian, M., Costanza, E., Warner, M.: “Hoovered up as a data point”: Exploring Privacy Behaviours, Awareness, and Concerns Among UK Users of LLM-based Conversational Agents. Proceedings on Privacy Enhancing Technologies (2025)

work page 2025
[17]

Journal of Personality Assessment (1) (2019).https://doi.org/10

Maples-Keller, J.L., Williamson, R.L., Sleep, C.E., Carter, N.T., Campbell, W.K., Miller, J.D.: Using Item Re- sponse Theory to Develop a 60-Item Representation of the NEO PI–R Using the International Personality Item Pool: Development of the IPIP–NEO–60. Journal of Personality Assessment (1) (2019).https://doi.org/10. 1080/00223891.2017.1381968,https://d...

work page doi:10.1080/00223891.2017.1381968 2019
[18]

Matz, Jacob D

Matz, S.C., Teeny, J.D., Vaid, S.S., Peters, H., Harari, G.M., Cerf, M.: The potential of generative AI for personalized persuasion at scale. Scientific Reports (1) (2024).https://doi.org/10.1038/s41598-024-53755-0, https://www.nature.com/articles/s41598-024-53755-0

work page doi:10.1038/s41598-024-53755-0 2024
[19]

Journal of Personality Assessment (3) (2005).https://doi.org/10.1207/s15327752jpa8403_05

McCrae, R.R., Costa, Jr., P.T., Martin, T.A.: The NEO–PI–3: A More Readable Revised NEO Personality Inventory. Journal of Personality Assessment (3) (2005).https://doi.org/10.1207/s15327752jpa8403_05

work page doi:10.1207/s15327752jpa8403_05 2005
[20]

McMahon, L.: Hundreds of thousands of Grok chats exposed in Google results (2025),https://www.bbc.com/ news/articles/cdrkmk00jy0o

work page 2025
[21]

CSCW (2011)

Minamikawa, A., Yokoyama, H.: Blog tells what kind of personality you have: egogram estimation from Japanese weblog. CSCW (2011)

work page 2011
[22]

In: Natural Language Processing and Information Systems

Molchanova, M.: Exploring the Potential of Large Language Models for Text-Based Personality Prediction. In: Natural Language Processing and Information Systems. Springer Nature Switzerland, Cham (2024),https: //link.springer.com/10.1007/978-3-031-70242-6_28

work page doi:10.1007/978-3-031-70242-6_28 2024
[23]

Journal of Research in Personality (2018).https://doi.org/10.1016/j.jrp.2017.12.004,https://www

Mønsted, B., Mollgaard, A., Mathiesen, J.: Phone-based metric as a predictor for basic personality traits. Journal of Research in Personality (2018).https://doi.org/10.1016/j.jrp.2017.12.004,https://www. sciencedirect.com/science/article/pii/S0092656618300011

work page doi:10.1016/j.jrp.2017.12.004 2018
[24]

PNAS Nexus (6) (2024).https://doi.org/10.1093/pnasnexus/pgae231,https://academic

Peters, H., Matz, S.C.: Large language models can infer psychological dispositions of social media users. PNAS Nexus (6) (2024).https://doi.org/10.1093/pnasnexus/pgae231,https://academic. oup.com/pnasnexus/article/doi/10.1093/pnasnexus/pgae231/7692212

work page doi:10.1093/pnasnexus/pgae231 2024
[25]

PLOS ONE (3) (2015).https://doi.org/10.1371/journal.pone.0122245,https://journals.plos.org/ plosone/article?id=10.1371/journal.pone.0122245

Rentfrow, P.J., Jokela, M., Lamb, M.E.: Regional Personality Differences in Great Britain. PLOS ONE (3) (2015).https://doi.org/10.1371/journal.pone.0122245,https://journals.plos.org/ plosone/article?id=10.1371/journal.pone.0122245

work page doi:10.1371/journal.pone.0122245 2015
[26]

Personality and Social Psychology Bulletin (6) (2002).https://doi.org/10.1177/0146167202289008,https: //doi.org/10.1177/0146167202289008

Roccas, S., Sagiv, L., Schwartz, S.H., Knafo, A.: The Big Five Personality Factors and Personal Values. Personality and Social Psychology Bulletin (6) (2002).https://doi.org/10.1177/0146167202289008,https: //doi.org/10.1177/0146167202289008

work page doi:10.1177/0146167202289008 2002
[27]

Journal of Cross-Cultural Psychology (2007).https://doi.org/10.1177//0022022106297299 Inferring Conversational Agent Users’ Personality Traits 15

Schmitt, D., McCrae, R., Benet, V ., Alcalay, L., Ault, L., Austers, I., Bianchi, G., Boholst, F., Cunen, M., Braeckman, J., Jr, E., Caral, L.G., Caron, G., Casullo, M., Cunningham, M., Daibo, I., de backer, C., Zupanèiè, A.: The geographic distribution of big five personality traits: Patterns and profiles of human self-description across 56 nations. Jour...

work page doi:10.1177//0022022106297299 2007
[28]

Simo, F.: Our approach to advertising and expanding access to chatgpt (2026),https://openai.com/index/ our-approach-to-advertising-and-expanding-access/

work page 2026
[29]

Sims, D.: Thousands of private ChatGPT conversations found via Google search after feature mishap (2025),https://www.techspot.com/news/ 108911-thousands-private-chatgpt-conversations-found-google-search-after. html

work page 2025
[30]

Staab, R., Vero, M., Balunovic, M., Vechev, M.: Beyond Memorization: Violating Privacy via Inference with Large Language Models (2023),https://openreview.net/forum?id=kmn0BhQk7p

work page 2023
[31]

Proceedings of the National Academy of Sciences (30) (2020).https://doi.org/10.1073/ pnas.1920484117,https://www.pnas.org/content/117/30/17680

Stachl, C., Au, Q., Schoedel, R., Gosling, S.D., Harari, G.M., Buschek, D., Völkel, S.T., Schuwerk, T., Oldemeier, M., Ullmann, T., Hussmann, H., Bischl, B., Bühner, M.: Predicting personality from patterns of behavior collected with smartphones. Proceedings of the National Academy of Sciences (30) (2020).https://doi.org/10.1073/ pnas.1920484117,https://w...

work page 2020
[32]

Psychol- ogy, Health & Medicine (3) (2020).https://doi.org/10.1080/13548506.2019.1687918,https: //www.tandfonline.com/doi/full/10.1080/13548506.2019.1687918

Weston, S.J., Edmonds, G.W., Hill, P.L.: Personality traits predict dietary habits in middle-to-older adults. Psychol- ogy, Health & Medicine (3) (2020).https://doi.org/10.1080/13548506.2019.1687918,https: //www.tandfonline.com/doi/full/10.1080/13548506.2019.1687918

work page doi:10.1080/13548506.2019.1687918 2020
[33]

Westwood, S.J., Druckman, E.J.N., Levi, M.: The potential existential threat of large language models to online survey research (2025)

work page 2025
[34]

Nature Human Behaviour pp

Wright, A.G.C., Ringwald, W.R., Vize, C.E., Eichstaedt, J.C., Angstadt, M., Taxali, A., Sripada, C.: Assess- ing personality using zero-shot generative AI scoring of brief open-ended text. Nature Human Behaviour pp. 1– 15 (Jan 2026).https://doi.org/10.1038/s41562-025-02389-x,https://www.nature.com/ articles/s41562-025-02389-x

work page doi:10.1038/s41562-025-02389-x 2026
[35]

Yang, A., Li, A., Yang, B., Zhang, B., Hui, B., Zheng, B., Yu, B., Gao, C., Huang, C., Lv, C., Zheng, C., Liu, D., Zhou, F., Huang, F., Hu, F., Ge, H., Wei, H., Lin, H., Tang, J., Yang, J., Tu, J., Zhang, J., Yang, J., Yang, J., Zhou, J., Zhou, J., Lin, J., Dang, K., Bao, K., Yang, K., Yu, L., Deng, L., Li, M., Xue, M., Li, M., Zhang, P., Wang, P., Zhu, Q...

work page internal anchor Pith review Pith/arXiv arXiv 2025
[36]

Zhu, Y ., Hu, L., Ning, N., Zhang, W., Wu, B.: A lexical psycholinguistic knowledge-guided graph neural network for interpretable personality detection (2022)

work page 2022
[37]

In: Proceedings of the 20th Chinese National Conference on Computational Linguistics

Zhuang, L., Wayne, L., Ya, S., Jun, Z.: A robustly optimized BERT pre-training approach with post-training. In: Proceedings of the 20th Chinese National Conference on Computational Linguistics. Chinese Information Processing Society of China, Huhhot, China (2021),https://aclanthology.org/2021.ccl-1.108/

work page 2021
[38]

AI is from the devil

Zufferey, N., Gaballah, S.A., Marky, K., Zimmermann, V .: “AI is from the devil.” Behaviors and Concerns Toward Personal Data Sharing with LLM-based Conversational Agents. Proceedings on Privacy Enhancing Technologies (2025)

work page 2025
[39]

openness

Zufferey, N., Humbert, M., Tavenard, R., Huguenin, K.: Watch your Watch: Inferring Personality Traits from Wearable Activity Trackers. In: Proc. of the USENIX Security Symp. USENIX Association (2023),https: //www.usenix.org/conference/usenixsecurity23/presentation/zufferey 16 D. Cögendez et al. A Dataset A.1 User Chats total mean median std min max 62090 ...

work page 2023

[1] [1]

Personality and Individ- ual Differences87, 147–152 (Dec 2015).https://doi.org/10.1016/j.paid.2015.07.037,https: //www.sciencedirect.com/science/article/pii/S0191886915004948

Alkı¸ s, N., Ta¸ skaya Temizel, T.: The impact of individual differences on influence strategies. Personality and Individ- ual Differences87, 147–152 (Dec 2015).https://doi.org/10.1016/j.paid.2015.07.037,https: //www.sciencedirect.com/science/article/pii/S0191886915004948

work page doi:10.1016/j.paid.2015.07.037 2015

[2] [2]

In: Proc

Arp, D., Quiring, E., Pendlebury, F., Warnecke, A., Pierazzi, F., Wressnegger, C., Cavallaro, L., Rieck, K.: Dos and Don’ts of Machine Learning in Computer Security. In: Proc. of the USENIX Security Symp. USENIX As- sociation (2020).https://doi.org/https://doi.org/10.48550/arXiv.2010.09470, zotAbbrevi- ate:no proceedingsTitle

work page doi:10.48550/arxiv.2010.09470 2020

[3] [3]

How People Use ChatGPT

Chatterji, A., Cunningham, T., Deming, D.J., Hitzig, Z., Ong, C., Shan, C.Y ., Wadman, K.: How people use chatgpt. Working Paper 34255, National Bureau of Economic Research (2025).https://doi.org/10.3386/w34255, http://www.nber.org/papers/w34255

work page doi:10.3386/w34255 2025

[4] [4]

Computers in Hu- man Behavior (2015).https://doi.org/10.1016/j.chb.2014.12.038,https://linkinghub

Chorley, M.J., Whitaker, R.M., Allen, S.M.: Personality and location-based social networks. Computers in Hu- man Behavior (2015).https://doi.org/10.1016/j.chb.2014.12.038,https://linkinghub. elsevier.com/retrieve/pii/S0747563214007559

work page doi:10.1016/j.chb.2014.12.038 2015

[5] [5]

Economics Letters (1) (2012)

Cobb-Clark, D.A., Schurer, S.: The stability of big-five personality traits. Economics Letters (1) (2012). https://doi.org/10.1016/j.econlet.2011.11.015,https://linkinghub.elsevier. com/retrieve/pii/S0165176511004666

work page doi:10.1016/j.econlet.2011.11.015 2012

[6] [6]

koi.ai/blog/urban-vpn-browser-extension-ai-conversations-data-collection

Dardikman, I.: 8 Million Users’ AI Conversations Sold for Profit by "Privacy" Extensions (2025),https://www. koi.ai/blog/urban-vpn-browser-extension-ai-conversations-data-collection

work page 2025

[7] [7]

Duarte, F.: Number of ChatGPT Users (January 2026) (2026),https://explodingtopics.com/blog/ chatgpt-users

work page 2026

[8] [8]

Cögendez et al

Duhigg, C.: What Does Your Credit-Card Company Know About You? The New York Times (May 2009),https: //www.nytimes.com/2009/05/17/magazine/17credit-t.html 14 D. Cögendez et al

work page 2009

[9] [9]

Egyptian Informatics Journal (1) (2022).https://doi.org/10.1016/j.eij.2021

El-Demerdash, K., El-Khoribi, R.A., Ismail Shoman, M.A., Abdou, S.: Deep learning based fusion strategies for personality prediction. Egyptian Informatics Journal (1) (2022).https://doi.org/10.1016/j.eij.2021. 05.004,https://linkinghub.elsevier.com/retrieve/pii/S1110866521000311

work page doi:10.1016/j.eij.2021 2022

[10] [10]

Nature (2018), https://www.nature.com/articles/d41586-018-03880-4

Gibney, E.: The scant science behind Cambridge Analytica’s controversial marketing techniques. Nature (2018), https://www.nature.com/articles/d41586-018-03880-4

work page 2018

[11] [11]

Journal of Research in Personality (6) (2003).https://doi.org/10.1016/S0092-6566(03)00046-1,https: //linkinghub.elsevier.com/retrieve/pii/S0092656603000461

Gosling, S.D., Rentfrow, P.J., Swann, W.B.: A very brief measure of the Big-Five personality domains. Journal of Research in Personality (6) (2003).https://doi.org/10.1016/S0092-6566(03)00046-1,https: //linkinghub.elsevier.com/retrieve/pii/S0092656603000461

[12]

Hassanein, M., Rady, S., Hussein, W., Gharib, T.F.: Predicting the Big Five for social network users using their personality characteristics. In: 2021 Tenth International Conference on Intelligent Computing and Information Systems (ICICIS). pp. 160–164 (Dec 2021). https://doi.org/10.1109/ICICIS52592.2021.9694160, https://ieeexplore.ieee.org/document/9694160

[13]

Hirsh, J.B., Kang, S.K., Bodenhausen, G.V.: Personalized Persuasion. Psychological Science (2012). https://doi.org/10.1177/0956797611436349, https://sci-hub.st/10.1177/0956797611436349

[14]

Kosinski, M., Bachrach, Y., Kohli, P., Stillwell, D., Graepel, T.: Manifestations of user personality in website choice and behaviour on online social networks. Machine Learning (3) (2014). https://doi.org/10.1007/s10994-013-5415-y, http://link.springer.com/10.1007/s10994-013-5415-y

[15]

Lima, A.C.E., de Castro, L.N.: A multi-label, semi-supervised classification approach applied to personality prediction in social media. Neural Networks (2014). https://doi.org/10.1016/j.neunet.2014.05.020, https://linkinghub.elsevier.com/retrieve/pii/S0893608014001282

[16]

Malki, L.M., Polamarasetty, A., Hatamian, M., Costanza, E., Warner, M.: “Hoovered up as a data point”: Exploring Privacy Behaviours, Awareness, and Concerns Among UK Users of LLM-based Conversational Agents. Proceedings on Privacy Enhancing Technologies (2025)

[17]

Maples-Keller, J.L., Williamson, R.L., Sleep, C.E., Carter, N.T., Campbell, W.K., Miller, J.D.: Using Item Response Theory to Develop a 60-Item Representation of the NEO PI–R Using the International Personality Item Pool: Development of the IPIP–NEO–60. Journal of Personality Assessment (1) (2019). https://doi.org/10.1080/00223891.2017.1381968, https://d...

[18]

Matz, S.C., Teeny, J.D., Vaid, S.S., Peters, H., Harari, G.M., Cerf, M.: The potential of generative AI for personalized persuasion at scale. Scientific Reports (1) (2024). https://doi.org/10.1038/s41598-024-53755-0, https://www.nature.com/articles/s41598-024-53755-0

[19]

McCrae, R.R., Costa, Jr., P.T., Martin, T.A.: The NEO–PI–3: A More Readable Revised NEO Personality Inventory. Journal of Personality Assessment (3) (2005). https://doi.org/10.1207/s15327752jpa8403_05

[20]

McMahon, L.: Hundreds of thousands of Grok chats exposed in Google results (2025), https://www.bbc.com/news/articles/cdrkmk00jy0o

[21]

Minamikawa, A., Yokoyama, H.: Blog tells what kind of personality you have: egogram estimation from Japanese weblog. CSCW (2011)

[22]

Molchanova, M.: Exploring the Potential of Large Language Models for Text-Based Personality Prediction. In: Natural Language Processing and Information Systems. Springer Nature Switzerland, Cham (2024), https://link.springer.com/10.1007/978-3-031-70242-6_28

[23]

Mønsted, B., Mollgaard, A., Mathiesen, J.: Phone-based metric as a predictor for basic personality traits. Journal of Research in Personality (2018). https://doi.org/10.1016/j.jrp.2017.12.004, https://www.sciencedirect.com/science/article/pii/S0092656618300011

[24]

Peters, H., Matz, S.C.: Large language models can infer psychological dispositions of social media users. PNAS Nexus (6) (2024). https://doi.org/10.1093/pnasnexus/pgae231, https://academic.oup.com/pnasnexus/article/doi/10.1093/pnasnexus/pgae231/7692212

[25]

Rentfrow, P.J., Jokela, M., Lamb, M.E.: Regional Personality Differences in Great Britain. PLOS ONE (3) (2015). https://doi.org/10.1371/journal.pone.0122245, https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0122245

[26]

Roccas, S., Sagiv, L., Schwartz, S.H., Knafo, A.: The Big Five Personality Factors and Personal Values. Personality and Social Psychology Bulletin (6) (2002). https://doi.org/10.1177/0146167202289008

[27]

Schmitt, D., McCrae, R., Benet, V., Alcalay, L., Ault, L., Austers, I., Bianchi, G., Boholst, F., Cunen, M., Braeckman, J., Jr, E., Caral, L.G., Caron, G., Casullo, M., Cunningham, M., Daibo, I., de backer, C., Zupančič, A.: The geographic distribution of big five personality traits: Patterns and profiles of human self-description across 56 nations. Journal of Cross-Cultural Psychology (2007). https://doi.org/10.1177/0022022106297299

[28]

Simo, F.: Our approach to advertising and expanding access to ChatGPT (2026), https://openai.com/index/our-approach-to-advertising-and-expanding-access/

[29]

Sims, D.: Thousands of private ChatGPT conversations found via Google search after feature mishap (2025), https://www.techspot.com/news/108911-thousands-private-chatgpt-conversations-found-google-search-after.html

[30]

Staab, R., Vero, M., Balunovic, M., Vechev, M.: Beyond Memorization: Violating Privacy via Inference with Large Language Models (2023), https://openreview.net/forum?id=kmn0BhQk7p

[31]

Stachl, C., Au, Q., Schoedel, R., Gosling, S.D., Harari, G.M., Buschek, D., Völkel, S.T., Schuwerk, T., Oldemeier, M., Ullmann, T., Hussmann, H., Bischl, B., Bühner, M.: Predicting personality from patterns of behavior collected with smartphones. Proceedings of the National Academy of Sciences (30) (2020). https://doi.org/10.1073/pnas.1920484117, https://www.pnas.org/content/117/30/17680

[32]

Weston, S.J., Edmonds, G.W., Hill, P.L.: Personality traits predict dietary habits in middle-to-older adults. Psychology, Health & Medicine (3) (2020). https://doi.org/10.1080/13548506.2019.1687918, https://www.tandfonline.com/doi/full/10.1080/13548506.2019.1687918

[33]

Westwood, S.J., Druckman, E.J.N., Levi, M.: The potential existential threat of large language models to online survey research (2025)

[34]

Wright, A.G.C., Ringwald, W.R., Vize, C.E., Eichstaedt, J.C., Angstadt, M., Taxali, A., Sripada, C.: Assessing personality using zero-shot generative AI scoring of brief open-ended text. Nature Human Behaviour pp. 1–15 (Jan 2026). https://doi.org/10.1038/s41562-025-02389-x, https://www.nature.com/articles/s41562-025-02389-x

[35]

Yang, A., Li, A., Yang, B., Zhang, B., Hui, B., Zheng, B., Yu, B., Gao, C., Huang, C., Lv, C., Zheng, C., Liu, D., Zhou, F., Huang, F., Hu, F., Ge, H., Wei, H., Lin, H., Tang, J., Yang, J., Tu, J., Zhang, J., Yang, J., Yang, J., Zhou, J., Zhou, J., Lin, J., Dang, K., Bao, K., Yang, K., Yu, L., Deng, L., Li, M., Xue, M., Li, M., Zhang, P., Wang, P., Zhu, Q...

[36]

Zhu, Y., Hu, L., Ning, N., Zhang, W., Wu, B.: A lexical psycholinguistic knowledge-guided graph neural network for interpretable personality detection (2022)

[37]

Zhuang, L., Wayne, L., Ya, S., Jun, Z.: A robustly optimized BERT pre-training approach with post-training. In: Proceedings of the 20th Chinese National Conference on Computational Linguistics. Chinese Information Processing Society of China, Huhhot, China (2021), https://aclanthology.org/2021.ccl-1.108/

[38]

Zufferey, N., Gaballah, S.A., Marky, K., Zimmermann, V.: “AI is from the devil.” Behaviors and Concerns Toward Personal Data Sharing with LLM-based Conversational Agents. Proceedings on Privacy Enhancing Technologies (2025)

[39]

Zufferey, N., Humbert, M., Tavenard, R., Huguenin, K.: Watch your Watch: Inferring Personality Traits from Wearable Activity Trackers. In: Proc. of the USENIX Security Symp. USENIX Association (2023), https://www.usenix.org/conference/usenixsecurity23/presentation/zufferey

A Dataset

A.1 User Chats

total mean median std min max 62090 ...
