Same Voice, Different Lab: On the Homogenization of Frontier LLM Personalities
Recognition: 2 theorem links
Pith reviewed 2026-05-15 07:49 UTC · model grok-4.3
The pith
Frontier LLMs converge on systematic and analytical personalities while suppressing emotional traits.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
All models tested converge on a form of trait expression that is systematic, methodical, and analytical, while suppressing traits such as remorseful and sycophantic. Moreover, models tend to diverge more in their expression of middle-of-distribution traits such as poetic or playful, but even these so-called creative models tend to have more neutral identities. These similarities suggest the implicit emergence of a standard of optimal assistant behavior. In a landscape of varied training methods, character training therefore stands out for its uniformity, offering insight into a tacit consensus among model developers.
What carries the argument
External ELO-based scoring across 144 traits, which ranks LLM responses to quantify trait expression and identify patterns of convergence or divergence.
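The mechanism can be sketched as a standard Elo update driven by pairwise judgments of which response expresses a trait more strongly. The paper's exact protocol (judge, pairing scheme, K-factor) is not given here, so the model names, trait, starting rating, and K value below are illustrative assumptions, not the authors' reported setup.

```python
# Illustrative sketch of Elo-style trait scoring from pairwise judgments.
# Everything concrete here (model names, trait, K-factor, starting rating)
# is an assumption, not the paper's reported protocol.

def expected_score(r_a: float, r_b: float) -> float:
    """Expected probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a: float, r_b: float, outcome: float, k: float = 32.0):
    """outcome = 1.0 if A's response expresses the trait more strongly, else 0.0."""
    delta = k * (outcome - expected_score(r_a, r_b))
    return r_a + delta, r_b - delta

# One rating per (model, trait) pair; 1500 is the conventional starting point.
ratings = {("model_x", "playful"): 1500.0, ("model_y", "playful"): 1500.0}

# Suppose an external judge prefers model_x's response as more playful:
a, b = ratings[("model_x", "playful")], ratings[("model_y", "playful")]
ratings[("model_x", "playful")], ratings[("model_y", "playful")] = elo_update(a, b, 1.0)
```

Iterating such updates over many prompts yields a per-trait ranking of models, which is the kind of relative ordering the convergence analysis compares.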
If this is right
- Character training produces more uniform personalities than other training methods across frontier models.
- An implicit standard for optimal assistant behavior is emerging without explicit coordination among developers.
- Users encounter similar neutral and methodical interaction styles from models built by different labs.
- Suppression of sycophantic and remorseful traits becomes a shared feature in advanced assistants.
- Creative traits show more variation but still cluster around neutral rather than extreme expressions.
Where Pith is reading between the lines
- The observed uniformity may limit user options for distinct AI personalities as more models adopt the same pattern.
- This convergence could stem from shared training data sources or common evaluation benchmarks used across labs.
- Models that deliberately deviate from the converged traits could be tested to measure impacts on user preference or task performance.
- The pattern raises questions about whether the standard prioritizes reliability over expressiveness in real-world use.
Load-bearing premise
The chosen set of 144 traits and the ELO scoring prompts produce an unbiased measure of personality that can detect real convergence rather than artifacts of the evaluation setup.
What would settle it
Re-running the full ELO scoring experiment on the same models with a different set of traits or alternative prompt formats, and finding no convergence on systematic and analytical expressions, would falsify the central claim.
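One concrete way to run that check is to compare per-model trait rankings across evaluation variants: genuine convergence should show high mean pairwise rank correlation that survives a change of trait set, while an instrument artifact would not. A minimal sketch, with invented placeholder scores rather than the paper's data:

```python
# Sketch: mean pairwise Spearman rank correlation of per-model trait rankings
# as a convergence measure. Scores below are invented placeholders, not data
# from the paper.
from itertools import combinations

def rank(values):
    """Ranks (1 = highest score); ties broken by position, fine for a sketch."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

def spearman(x, y):
    """Spearman rho via the no-ties formula on two equal-length score lists."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical trait scores for three models over the same five traits.
scores = {
    "model_a": [0.9, 0.8, 0.3, 0.2, 0.1],
    "model_b": [0.85, 0.75, 0.35, 0.25, 0.15],
    "model_c": [0.1, 0.2, 0.9, 0.8, 0.3],
}
pairs = combinations(scores.values(), 2)
mean_rho = sum(spearman(x, y) for x, y in pairs) / 3
```

A high mean rho under the original trait set that collapses under an alternative set would point to genuine convergence; a rho that tracks the instrument rather than the models would point to artifact.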
Original abstract
LLM assistant personalities play a critical role in user experience and perceived response quality. We present a large-scale experiment of frontier LLM personalities using external ELO-based traits scoring across 144 traits. We find that all models tested converge on a form of trait expression that is systematic, methodical, and analytical and suppress traits such as remorseful and sycophantic. Moreover, models tend to diverge more in their expression of "middle-of-distribution traits" such as poetic or playful, but even these so-called "creative" models tend to have more neutral identities. These similarities suggest an implicit emergence of a standard of optimal assistant behavior. In a landscape of varied training methods, character training, therefore, stands out for its uniformity, offering insight into a tacit consensus between model developers.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript reports on a large-scale empirical study of frontier large language model (LLM) assistant personalities. Using an external ELO-based scoring mechanism applied to 144 personality traits, the authors find that models from different developers converge in expressing systematic, methodical, and analytical traits while suppressing remorseful and sycophantic ones. Greater divergence is observed in middle-of-the-distribution traits such as poetic or playful, though even these tend toward neutral expressions. The results are interpreted as indicating an implicit emergence of a standard for optimal assistant behavior, highlighting uniformity in character training across varied training methods.
Significance. Should the empirical results hold under scrutiny of the scoring methodology, this work would be significant for the field of human-computer interaction and AI development. It provides evidence of homogenization in LLM personalities, which has direct implications for user experience, perceived response quality, and the potential for a tacit consensus among model developers. The scale of the experiment across multiple frontier models adds to its potential to influence discussions on AI alignment and the design of assistant behaviors. The quantitative ELO ranking approach offers a structured comparison framework.
major comments (2)
- Abstract: The abstract states the experiment and main findings but supplies no details on sample sizes, prompt design, statistical controls, or how traits were selected, so it is impossible to judge whether the data actually support the convergence claim. This lack of methodological transparency is load-bearing for the central claim of homogenization.
- Evaluation section: The claim that similarities suggest an implicit emergence of a standard of optimal assistant behavior rests on the assumption that the external ELO-based scoring across 144 traits provides an unbiased and comprehensive measure of LLM personalities. Without reported controls for trait curation criteria, prompt ablation, or correlation with human ratings, the observed pattern (convergence on analytical traits, suppression of remorseful/sycophantic ones, and greater divergence only on middle traits) could be an artifact of the measurement instrument rather than genuine homogenization.
minor comments (2)
- Abstract: The phrase 'middle-of-distribution traits' is introduced without a clear definition or reference to how the distribution was determined; including a supplementary figure or table showing trait score distributions across models would improve clarity.
- Discussion: The interpretation of 'tacit consensus between model developers' could benefit from explicit discussion of alternative explanations such as shared pre-training corpora or common RLHF objectives, to strengthen the causal inference.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback, which highlights important aspects of methodological transparency and validation. We address each major comment point by point below, with clarifications based on the manuscript and proposed revisions where they strengthen the work without altering its core findings.
Point-by-point responses
-
Referee: Abstract: The abstract states the experiment and main findings but supplies no details on sample sizes, prompt design, statistical controls, or how traits were selected, so it is impossible to judge whether the data actually support the convergence claim. This lack of methodological transparency is load-bearing for the central claim of homogenization.
Authors: We agree that the abstract's conciseness limits immediate assessment of the claims. The full manuscript details the evaluation of five frontier LLMs, the curation of 144 traits from established personality psychology and AI alignment sources, the use of multiple prompt variations per trait, and the ELO-based relative scoring procedure with controls for response length and consistency. To improve accessibility, we will revise the abstract to incorporate a brief clause on experimental scale and trait selection (e.g., 'using ELO scoring across 144 traits in five frontier models'). This revision maintains abstract brevity while directing readers to the Methods section for full details on prompt design and statistical approaches. revision: yes
-
Referee: Evaluation section: The claim that similarities suggest an implicit emergence of a standard of optimal assistant behavior rests on the assumption that the external ELO-based scoring across 144 traits provides an unbiased and comprehensive measure of LLM personalities. Without reported controls for trait curation criteria, prompt ablation, or correlation with human ratings, the observed pattern (convergence on analytical traits, suppression of remorseful/sycophantic ones, and greater divergence only on middle traits) could be an artifact of the measurement instrument rather than genuine homogenization.
Authors: The 144 traits were selected to comprehensively span analytical, emotional, and creative dimensions drawn from prior HCI and psychology literature, with curation criteria explicitly described in the Methods. The ELO approach enables scalable relative ranking across models without exhaustive human pairwise comparisons. While prompt ablation and direct human rating correlations were not performed in this study, the convergence pattern holds consistently across models from distinct developers and training regimes, reducing the likelihood of pure measurement artifact. We will add a Limitations subsection acknowledging these gaps and outlining future validation needs, including human studies. This supports the interpretation of an emerging standard while transparently noting methodological boundaries. revision: partial
Circularity Check
No significant circularity: empirical scoring results stand independently
full rationale
The paper reports direct empirical measurements via external ELO-based scoring on 144 traits for multiple frontier LLMs, then observes convergence patterns in the resulting scores. No equations, fitted parameters presented as predictions, self-citations, or ansatzes are invoked to derive the central claim; the similarities are presented as observed outcomes of the scoring protocol itself. This is a standard empirical workflow with no reduction of results to inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: ELO-based external scoring across 144 traits accurately and neutrally captures LLM personality expression
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel (unclear)
unclear: Relation between the paper passage and the cited Recognition theorem.
We present a large-scale experiment of frontier LLM personalities using external ELO-based traits scoring across 144 traits... converge on a form of trait expression that is systematic, methodical, and analytical
-
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction (unclear)
unclear: Relation between the paper passage and the cited Recognition theorem.
Spearman correlation scores for trait rankings... inverse U-shaped pattern of trait expressivity
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Open Character Training: Shaping the Persona of AI Assistants through Constitutional AI
Maiya, Sharan and Bartsch, Henning and Lambert, Nathan and Hubinger, Evan. Open Character Training: Shaping the Persona of AI Assistants through Constitutional AI. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 2025
work page 2025
-
[2]
Han, Pengrui and Kocielnik, Rafał and Song, Peiyang and Debnath, Ramit and Mobbs, Dean and Anandkumar, Anima and Alvarez, R. Michael. The Personality Illusion: Revealing Dissociation Between Self-Reports & Behavior in LLMs. Proceedings of the NeurIPS 2025 Workshop on Responsible Foundation Models. 2025
work page 2025
-
[3]
Zou, Huiqi and Wang, Pengda and Yan, Zihan and Sun, Tianjun and Xiao, Ziang. Can LLM "Self-report"?: Evaluating the Validity of Self-report Scales in Measuring Personality Design in LLM-based Chatbots. Proceedings of the First Conference on Language Modeling (COLM). 2025
work page 2025
-
[4]
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
GLM-4.5 Team and Chen, Bin and Xie, Chengxing and Wang, Cunxiang and Yin, Da and Zeng, Hao and Zhang, Jiajie and Wang, Kedong and Zhong, Lucen and Liu, Mingdao and Lu, Rui and Cao, Shulin and Zhang, Xiaohan and Huang, Xuancheng and Wei, Yao and Cheng, Yean and An, Yifan and Niu, Yilin and Wen, Yuanhao and Bai, Yushi and Du, Zhengxiao and Wang, Zihan and Z...
work page · Pith review · arXiv · 2025
-
[5]
INTELLECT-3: Technical Report
Prime Intellect Team and Senghaas, Mika and Obeid, Fares and Jaghouar, Sami and Brown, William and Ong, Jack Min and Auras, Daniel and Sirovatka, Matej and Straube, Jannik and Baker, Andrew and Müller, Sebastian and Mattern, Justus and Basra, Manveer and Ismail, Aiman and Scherm, Dominik and Miller, Cooper and Patel, Ameen and Kirsten, Simon and Sieg,...
-
[6]
WildChat: 1M ChatGPT Interaction Logs in the Wild
Zhao, Wenting and Ren, Xiang and Hessel, Jack and Cardie, Claire and Choi, Yejin and Deng, Yuntian. WildChat: 1M ChatGPT Interaction Logs in the Wild. arXiv preprint arXiv:2405.01470. 2024
-
[7]
Wu, Zhaofeng and Yu, Xinyan Velocity and Yogatama, Dani and Lu, Jiasen and Kim, Yoon. The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities. Proceedings of the International Conference on Learning Representations (ICLR). 2025
work page 2025
-
[8]
Correlated Errors in Large Language Models
Kim, Elliot and Garg, Avi and Peng, Kenny and Garg, Nikhil. Correlated Errors in Large Language Models. Proceedings of the 42nd International Conference on Machine Learning (ICML). 2025
work page 2025
-
[9]
Self-Preference Bias in LLM-as-a-Judge
Wataoka, Koki and Takahashi, Tsubasa and Ri, Ryokan. Self-Preference Bias in LLM-as-a-Judge. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 2025
work page 2025
-
[10]
Defeating Nondeterminism in LLM Inference
He, Horace and Thinking Machines Lab. Defeating Nondeterminism in LLM Inference. 2025
work page 2025
-
[11]
LLM Probability Concentration: How Alignment Shrinks the Generative Horizon
Yang, Chenghao and Holtzman, Ari. LLM Probability Concentration: How Alignment Shrinks the Generative Horizon. arXiv preprint arXiv:2506.17871. 2025
-
[12]
Towards Understanding Sycophancy in Language Models
Sharma, Mrinank and Tong, Meg and Korbak, Tomasz and Duvenaud, David and Askell, Amanda and Bowman, Samuel R. and Cheng, Newton and Durmus, Esin and Hatfield-Dodds, Zac and Johnston, Scott R. and Kravec, Shauna and Maxwell, Timothy and McCandlish, Sam and Ndousse, Kamal and Rausch, Oliver and Schiefer, Nicholas and Yan, Da and Zhang, Miranda and Perez, Et...
work page 2024
-
[13]
Constitutional AI: Harmlessness from AI Feedback
Bai, Yuntao and Kadavath, Saurav and Kundu, Sandipan and Askell, Amanda and Kernion, Jackson and Jones, Andy and Chen, Anna and Goldie, Anna and Mirhoseini, Azalia and McKinnon, Cameron and Chen, Carol and Olsson, Catherine and Olah, Christopher and Hernandez, Danny and Drain, Dawn and Ganguli, Deep and Li, Dustin and Tran-Johnson, Eli and Perez, Ethan an...
work page · Pith review · arXiv · 2022
- [14]
- [15]
-
[16]
ProfiLLM: An LLM-Based Framework for Implicit Profiling of Chatbot Users
David, Shahaf and Meidan, Yair and Hersko, Ido and Varnovitzky, Daniel and Mimran, Dudu and Elovici, Yuval and Shabtai, Asaf. ProfiLLM: An LLM-Based Framework for Implicit Profiling of Chatbot Users. arXiv preprint arXiv:2506.13980. 2025
-
[17]
Rahman, Hasibur and Desai, Smit. Vibe Check: Understanding the Effects of LLM-Based Conversational Agents' Personality and Alignment on User Perceptions in Goal-Oriented Tasks. arXiv preprint arXiv:2509.09870. 2025
work page · Pith review · arXiv · 2025
-
[18]
Raja, Rahul and Vats, Arpita. Evaluating Generalization and Representation Stability in Small LMs via Prompting, Fine-Tuning and Out-of-Distribution Prompts. arXiv preprint arXiv:2506.17289. 2025
-
[19]
Kim, Jiin and Shin, Byeongjun and Chung, Jinha and Rhu, Minsoo. The Cost of Dynamic Reasoning: Demystifying AI Agents and Test-Time Scaling from an AI Infrastructure Perspective. arXiv preprint arXiv:2506.04301. 2025
-
[20]
PersonaLLM: Investigating the Ability of Large Language Models to Express Personality Traits
Jiang, Hang and Zhang, Xiajie and Cao, Xubo and Breazeal, Cynthia and Roy, Deb and Kabbara, Jad. PersonaLLM: Investigating the Ability of Large Language Models to Express Personality Traits. Findings of the Association for Computational Linguistics: NAACL 2024. 2024
work page 2024
-
[21]
LLMs Simulate Big Five Personality Traits: Further Evidence
Sorokovikova, Aleksandra and Fedorova, Natalia and Rezagholi, Sharwin and Yamshchikov, Ivan P. LLMs Simulate Big Five Personality Traits: Further Evidence. arXiv preprint arXiv:2402.01765. 2024
-
[22]
Personality Traits in Large Language Models
Serapio-García, Gregory and Safdari, Mustafa and Crepy, Clément and Sun, Luning and Fitz, Stephen and Romero, Peter and Abdulhai, Marwa and Faust, Aleksandra and Matarić, Maja. Personality Traits in Large Language Models. Nature Machine Intelligence. 2023
work page 2023
-
[23]
Park, Joon Sung and O'Brien, Joseph C. and Cai, Carrie J. and Morris, Meredith Ringel and Liang, Percy and Bernstein, Michael S. Generative Agents: Interactive Simulacra of Human Behavior. Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (UIST '23). 2023
work page 2023
-
[24]
LLM Agents Grounded in Self-Reports Enable General-Purpose Simulation of Individuals
Park, Joon Sung and Zou, Carolyn Q. and Shaw, Aaron and Hill, Benjamin Mako and Cai, Carrie and Morris, Meredith Ringel and Willer, Robb and Liang, Percy and Bernstein, Michael S. Generative Agent Simulations of 1,000 People. arXiv preprint arXiv:2411.10109. 2024
work page · Pith review · arXiv · 2024
-
[25]
Claude 'Soul Document' — Character Training Specification
Richard Weiss. Claude 'Soul Document' — Character Training Specification. LessWrong. 2025
work page 2025
- [26]
- [27]
- [28]
-
[29]
Retiring GPT-4o, GPT-4.1, GPT-4.1 mini, and OpenAI o4-mini in ChatGPT
-
[30]
Expanding on what we missed with sycophancy
-
[31]
Sycophancy in GPT-4o: What happened and what we're doing about it
-
[32]
GPT-5 AMA with OpenAI's Sam Altman and Some of the GPT-5 Team. 2025
work page 2025
-
[33]
Challenging the Validity of Personality Tests for Large Language Models. 2025
work page 2025
-
[34]
Language Models Resist Alignment: Evidence From Data Compression. 2024
work page 2024
-
[35]
Operationalising the Superficial Alignment Hypothesis via Task Complexity. 2026
work page 2026
-
[36]
doi:10.48550/arXiv.2510.22954
Liwei Jiang and Yuanjun Chai and Margaret Li and Mickel Liu and Raymond Fok and Nouha Dziri and Yulia Tsvetkov and Maarten Sap and Alon Albalak and Yejin Choi. arXiv:2510.22954.
-
[37]
Varun Singh and Lucas Krauss and Sami Jaghouar and Matej Sirovatka and Charles Goddard and Fares Obied and Jack Min Ong and Jannik Straube and Fern and Aria Harley and Conner Stewart and Colin Kealty and Maziyar Panahi and Simon Kirsten and Anushka Deshpande and Anneketh Vij and Arthur Bresnu and Pranav Veldurthi and Raghav Ravishankar and Hardik Bishnoi ...
- [38]