Can LLMs Infer Conversational Agent Users' Personality Traits from Chat History?
Pith reviewed 2026-05-13 23:01 UTC · model grok-4.3
The pith
Fine-tuned RoBERTa-base classifiers infer personality traits such as extraversion from ChatGPT chat histories with accuracy above random guessing in multiple cases.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We collected actual ChatGPT logs from N=668 participants containing 62,090 individual chats and fine-tuned RoBERTa-base text classification models to infer personality traits from CA interactions. The findings show that these models achieve trait inference with accuracy (ternary classification) better than random in multiple cases. For example, for extraversion, accuracy improves by +44% relative to the baseline on interactions for relationships and personal reflection.
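The "+44% relative to the baseline" figure is a relative gain, not a gain in percentage points. A minimal sketch of that arithmetic follows. The numbers are hypothetical and chosen only for illustration: a uniform three-class chance level is assumed as the baseline, and the accuracy value is not taken from the paper.

```python
def relative_improvement(accuracy: float, baseline: float) -> float:
    """Relative gain of accuracy over baseline, e.g. 0.44 for +44%."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (accuracy - baseline) / baseline


# Hypothetical values, not taken from the paper: a uniform-chance
# baseline for three classes and an accuracy picked for illustration.
chance = 1 / 3
example_accuracy = 0.48
gain = relative_improvement(example_accuracy, chance)
print(f"{gain:+.0%}")
```

Under these assumed values, a +44% relative gain corresponds to roughly 15 percentage points over chance, which is why the choice of baseline matters when reading the headline number.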
What carries the argument
RoBERTa-base text classification models fine-tuned on conversational chat logs to predict personality traits via ternary classification.
If this is right
- Interactions about relationships and personal reflection carry higher risk for inferring extraversion than other use cases.
- Privacy risks vary systematically by conversation type rather than being uniform across all chats.
- Derived personality information can be obtained without users stating their traits directly.
- The same logs that support normal use also enable trait inference at measurable rates above chance.
Where Pith is reading between the lines
- Users may want to treat all exchanges with conversational agents as potentially revealing rather than private.
- Similar inference pipelines could be tested on other attributes such as emotional state or decision biases.
- Service providers could add on-device filtering or noise injection to reduce unintended trait leakage.
- Policy discussions around AI privacy should include derived personal data obtained from interaction patterns.
Load-bearing premise
The collected chat logs contain sufficient and unbiased signals of personality traits, and the participants' self-reported personality labels are accurate and stable enough to serve as ground truth.
What would settle it
A replication that pairs the same chat logs with personality labels obtained independently (for example through observer ratings rather than self-report) and finds that model accuracy drops to random levels.
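Whether accuracy "drops to random levels" can be checked with an exact one-sided binomial test against the chance level. The sketch below is one possible check, not the paper's procedure; the counts are hypothetical, and the test assumes independent predictions, which may not hold when several chats come from the same participant.

```python
from math import comb


def binomial_p_value(correct: int, total: int, chance: float) -> float:
    """One-sided exact p-value: P(X >= correct) for X ~ Binomial(total, chance)."""
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")
    return sum(
        comb(total, k) * chance**k * (1 - chance) ** (total - k)
        for k in range(correct, total + 1)
    )


# Hypothetical counts for illustration only (three classes, chance = 1/3).
p_at_chance = binomial_p_value(correct=40, total=120, chance=1 / 3)
p_above = binomial_p_value(correct=58, total=120, chance=1 / 3)
print(f"at chance: p = {p_at_chance:.3f}, above chance: p = {p_above:.5f}")
```

A large p-value for the replication, alongside a small one for the original labels, would be the pattern this section describes.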
Original abstract
Sensitive information, such as knowledge about an individual's personality, can be can be misused to influence behavior (e.g., via personalized messaging). To assess to what extent an individual's personality can be inferred from user interactions with LLM-based conversational agents (CAs), we analyze and quantify related privacy risks of using CAs. We collected actual ChatGPT logs from N=668 participants, containing 62,090 individual chats, and report statistics about the different types of shared data and use cases. We fine-tuned RoBERTa-base text classification models to infer personality traits from CA interactions. The findings show that these models achieve trait inference with accuracy (ternary classification) better than random in multiple cases. For example, for extraversion, accuracy improves by +44% relative to the baseline on interactions for relationships and personal reflection. This research highlights how interactions with CAs pose privacy risks and provides fine-grained insights into the level of risk associated with different types of interactions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that LLMs can infer users' personality traits from interactions with conversational agents. Using real ChatGPT logs collected from N=668 participants (62,090 chats), the authors fine-tune RoBERTa-base models for ternary classification of personality traits and report accuracies better than random baselines, including a +44% relative improvement for extraversion on interactions involving relationships and personal reflection. The work quantifies privacy risks associated with different types of CA usage.
Significance. If the empirical results hold after methodological clarification, the paper would offer concrete evidence of privacy vulnerabilities in LLM-based conversational agents, with fine-grained analysis of risk levels across interaction categories. The use of actual user logs rather than synthetic data is a positive aspect that could strengthen the contribution to privacy research in NLP.
major comments (2)
- [Abstract] The central claim of better-than-random ternary classification accuracy (e.g., +44% relative gain for extraversion) rests on the quality of the personality labels as ground truth. The abstract supplies no information on the assessment instrument, recruitment procedure, label reliability (test-retest or behavioral validation), or potential confounds, making it impossible to determine whether reported lifts reflect conversational signals or label artifacts.
- [Abstract] No description is given of the train/test split strategy, baseline construction, or statistical tests confirming that accuracies exceed chance. These omissions are load-bearing because the headline result cannot be verified or replicated without them.
minor comments (1)
- [Abstract] Typographical error in the sentence 'Sensitive information, such as knowledge about an individual's personality, can be can be misused'.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments. We have revised the abstract to supply the requested methodological details on label construction, data partitioning, and evaluation. Point-by-point responses follow.
- Referee: [Abstract] The central claim of better-than-random ternary classification accuracy (e.g., +44% relative gain for extraversion) rests on the quality of the personality labels as ground truth. The abstract supplies no information on the assessment instrument, recruitment procedure, label reliability (test-retest or behavioral validation), or potential confounds, making it impossible to determine whether reported lifts reflect conversational signals or label artifacts.
  Authors: We agree the abstract was too terse. The revised abstract now states that traits were measured with the 44-item Big Five Inventory (BFI) via self-report, that participants were recruited through online platforms with informed consent, and that labels were obtained by tertile splits. The full manuscript already describes topic-level controls and reports consistent gains across multiple categories, which would be unlikely under pure label noise. We note that no additional test-retest or behavioral validation was collected for this sample.
  Revision: yes
- Referee: [Abstract] No description is given of the train/test split strategy, baseline construction, or statistical tests confirming that accuracies exceed chance. These omissions are load-bearing because the headline result cannot be verified or replicated without them.
  Authors: We accept the criticism. The revised abstract now indicates that an 80/20 user-stratified split was used to avoid leakage, that the random baseline reflects the majority-class proportion, and that improvements were assessed with statistical tests against chance. These details were already present in the methods and results sections; the abstract has been expanded to make them visible at a glance.
  Revision: yes
- Test-retest reliability or behavioral validation of the personality labels beyond standard BFI self-report administration
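The rebuttal mentions an 80/20 user-stratified split and a baseline that reflects the majority-class proportion. The sketch below shows one way to read that description, assuming "user-stratified" means the split is made at the user level so that no participant's chats appear in both sets; it does not additionally balance class proportions across the two sets. The function names, record layout, and toy data are hypothetical and not taken from the paper.

```python
import random
from collections import Counter


def split_by_user(chats, test_fraction=0.2, seed=0):
    """Split (user_id, text, label) records so that no user spans both sets."""
    users = sorted({user_id for user_id, _, _ in chats})
    rng = random.Random(seed)
    rng.shuffle(users)
    n_test = max(1, round(len(users) * test_fraction))
    test_users = set(users[:n_test])
    train = [c for c in chats if c[0] not in test_users]
    test = [c for c in chats if c[0] in test_users]
    return train, test


def majority_baseline(train, test):
    """Accuracy of always predicting the most frequent training label."""
    majority_label = Counter(label for _, _, label in train).most_common(1)[0][0]
    hits = sum(1 for _, _, label in test if label == majority_label)
    return hits / len(test)


# Toy records: (hypothetical user id, chat text, trait class 0/1/2).
# The label depends only on the user, since a trait is a per-user attribute.
toy_chats = [(f"u{i % 10}", f"chat {i}", (i % 10) % 3) for i in range(60)]
train_set, test_set = split_by_user(toy_chats)
baseline_accuracy = majority_baseline(train_set, test_set)
```

Splitting by chat instead of by user would let a model recognize a participant's writing style in the test set, which is the leakage the rebuttal says the split was designed to avoid.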
Circularity Check
No circularity: empirical supervised inference on collected logs and external labels
full rationale
The paper's core result is obtained by collecting 62,090 real ChatGPT logs from 668 participants, obtaining separate personality trait labels (presumably via standard self-report instruments), fine-tuning RoBERTa-base classifiers on the chat text, and measuring ternary classification accuracy against held-out data. Reported lifts such as +44% relative improvement for extraversion are direct performance metrics on this pipeline. No step redefines the target labels in terms of the model's outputs, renames a fitted parameter as a prediction, or relies on a self-citation chain to establish uniqueness or an ansatz. The derivation is therefore self-contained and externally falsifiable.
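The ternary targets in this pipeline come from tertile splits of self-reported trait scores, according to the simulated rebuttal. A minimal sketch of such a split follows, using Python's `statistics.quantiles`; the cut-point convention (values at or below a cut fall in the lower class) and the example scores are assumptions for illustration, not details confirmed by the paper.

```python
from statistics import quantiles


def tertile_labels(scores):
    """Map trait scores to 0 (low), 1 (medium), 2 (high) using tertile cut points."""
    low_cut, high_cut = quantiles(scores, n=3)
    return [0 if s <= low_cut else 1 if s <= high_cut else 2 for s in scores]


# Hypothetical extraversion scores on a 1-5 scale.
scores = [1.5, 2.0, 2.4, 2.9, 3.1, 3.3, 3.8, 4.0, 4.6]
labels = tertile_labels(scores)
print(labels)
```

Because the cut points are computed from the sample itself, the three classes are roughly balanced by construction, which keeps a majority-class baseline close to one third.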
Axiom & Free-Parameter Ledger
axioms (1)
- standard math: RoBERTa-base can be fine-tuned for ternary text classification of personality traits from chat logs
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "We fine-tuned RoBERTa-base text classification models to infer personality traits from CA interactions... accuracy (ternary classification) better than random"
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "We collected actual ChatGPT logs from N=668 participants, containing 62,090 individual chats"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
- Inferential Privacy Leakage in Anonymized Conversational AI Logs
  LLM-based inference recovers user age, gender, and country from filtered ChatGPT logs at weighted F1 scores of 0.84-0.90, with median identification from the first 5% of history, driven by stereotype patterns.