arxiv: 2408.09049 · v3 · submitted 2024-08-16 · 💻 cs.CL · cs.AI · cs.HC

Inertia in Moral and Value Judgments of Large Language Models

Pith reviewed 2026-05-23 22:02 UTC · model grok-4.3

classification 💻 cs.CL · cs.AI · cs.HC
keywords large language models · moral judgments · value inertia · persona prompting · AI bias · role-play · value orientations · ethical responses

The pith

Large language models maintain consistent moral and value orientations even when assigned different personas.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests the common practice of using persona prompts to make LLMs produce varied, human-like responses on moral and value questions. Instead of wide variation, the models show persistent inertia, with dimensions such as harm avoidance and fairness remaining skewed in the same direction across many different personas. This points to fixed internal value preferences that prompting does not easily change. A reader would care because applications that need balanced ethical judgments rely on these models to adapt their outputs.

Core claim

The authors establish that LLMs exhibit value orientation and inertia: when role-play prompts assign randomized personas and outputs are analyzed at scale, certain moral dimensions stay skewed in one direction regardless of the persona, revealing strong internal biases and value preferences rather than flexible, context-sensitive responses.

What carries the argument

Role-play at scale, which pairs randomized persona prompts with macro-level analysis of model outputs to detect consistency in moral and value judgments.
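
A minimal sketch of how such a pipeline could be wired together. Everything named here is an illustrative assumption rather than the authors' implementation: the persona attributes, the prompt wording, the 0 to 5 answer scale, and the `ask_model` callable. `stub_model` stands in for a real LLM call and ignores the persona entirely, so it shows what complete inertia would look like in the summary.

```python
import random
import statistics
from typing import Callable, Dict, List

# Hypothetical persona attributes (not taken from the paper).
AGES = ["25-year-old", "40-year-old", "70-year-old"]
JOBS = ["nurse", "farmer", "software engineer", "soldier"]
PLACES = ["Seoul", "Lagos", "Lima", "Oslo"]


def random_persona(rng: random.Random) -> str:
    """Draw one randomized persona description."""
    return f"a {rng.choice(AGES)} {rng.choice(JOBS)} from {rng.choice(PLACES)}"


def build_prompt(persona: str, question: str) -> str:
    """Pair a persona with a questionnaire item (wording is illustrative)."""
    return (
        f"You are {persona}. Answer with a single number from 0 "
        f"(strongly disagree) to 5 (strongly agree).\n{question}"
    )


def role_play_at_scale(
    ask_model: Callable[[str], int],
    questions: List[str],
    n_personas: int,
    seed: int = 0,
) -> Dict[str, Dict[str, float]]:
    """Ask every question under many random personas, then summarize per question."""
    rng = random.Random(seed)
    scores: Dict[str, List[int]] = {q: [] for q in questions}
    for _ in range(n_personas):
        persona = random_persona(rng)
        for q in questions:
            scores[q].append(ask_model(build_prompt(persona, q)))
    # Macro-level view: mean and spread across personas for each question.
    return {
        q: {"mean": statistics.mean(v), "stdev": statistics.pstdev(v)}
        for q, v in scores.items()
    }


def stub_model(prompt: str) -> int:
    """Stand-in for a real LLM call: always answers 5, ignoring the persona."""
    return 5


items = ["Item A (placeholder)", "Item B (placeholder)"]
summary = role_play_at_scale(stub_model, items, n_personas=100)
print(summary)
```

With a real model behind `ask_model`, a per-question spread close to zero alongside a mean near one end of the scale would correspond to the skew the paper describes.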

If this is right

  • LLMs will give similar moral judgments on harm and fairness questions no matter which persona is assigned.
  • Value preferences remain stable across persona changes rather than shifting with context.
  • Applications that require balanced outputs on ethical topics need prior scrutiny of these fixed orientations.
  • Persona prompting alone does not overcome the observed inertia in value judgments.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Systems that use LLMs for policy advice or ethical review may inherit the same consistent skews unless additional controls are added.
  • Measuring inertia on other value dimensions beyond harm avoidance could show whether the pattern holds more broadly.
  • Training adjustments aimed at increasing response variety on moral questions might be tested as a direct follow-up.

Load-bearing premise

The expectation that different personas should produce a wide range of opinions comparable to variation across human individuals.

What would settle it

A large set of trials in which randomized persona prompts produce responses on harm avoidance and fairness that vary as widely as the authors anticipated from human-like differences.
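
One way to make this criterion concrete is to compare the spread of persona-prompted answers with the spread of a reference sample on the same item. The sketch below is an assumption about how that comparison might be set up; the function name `spread_ratio` and the toy numbers are hypothetical and do not come from the paper.

```python
import statistics
from typing import Sequence


def spread_ratio(
    persona_scores: Sequence[float], reference_scores: Sequence[float]
) -> float:
    """Ratio of persona-induced spread to reference spread (population std)."""
    reference_spread = statistics.pstdev(reference_scores)
    if reference_spread == 0:
        raise ValueError("reference scores have zero spread")
    return statistics.pstdev(persona_scores) / reference_spread


# Toy numbers for illustration only; they are not results from the paper.
toy_persona = [4, 4, 4, 4]    # identical answers under every persona
toy_reference = [1, 3, 3, 5]  # answers that vary across individuals
print(spread_ratio(toy_persona, toy_reference))  # 0.0
```

A ratio near 1 would mean personas induce as much variation as the reference sample, which is the outcome that would count against the inertia claim; a ratio near 0 would be consistent with it.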

Figures

Figures reproduced from arXiv: 2408.09049 by Bruce W. Lee, Hyunsoo Cho, Yeongheon Lee.

Figure 1: Surface Diversity vs Underlying Consistency: When LLM is prompted with the same question under various personas, its responses might appear diverse. However, we demonstrate that, at a macro level, the answers converge toward a consistent direction.
Figure 2: Overview of the Role-Play-at-Scale method. We prompt a Large Language Model (LLM) to respond to …
Figure 3: Regardless of the persona, the LLM exhibits a consistent default behavior: (a) provides a macro-level …
Figure 4: LLM responses remain highly consistent across three independently generated persona sets, underscoring …
Figure 5: Impact of Increased Role-Play on Response Variance: As the number of role-play iterations increases, the …
Figure 6: The figure displays the average scores for each moral foundation (MFQ-30) and value dimension (PVQ …
Figure 7: Heatmaps of Individual Responses: The x-axis represents 100 random personas and the y-axis denotes each questionnaire. The color-coded responses reveal distinct horizontal stripes, indicating a consistent bias across all persona prompts.
Figure 8: Heatmaps of Individual Responses: The x-axis represents 100 random personas and the y-axis denotes each questionnaire. The color-coded responses reveal distinct horizontal stripes, indicating a consistent bias across all persona prompts.
Figure 9: We report role-play-at-scale results across four models in this figure. LLMs were asked each question 200 …
Figure 10: Breakdown of …
Figure 11: Breakdown of …
Original abstract

Large Language Models (LLMs) behave non-deterministically, and prompting has become a common method for steering their outputs. A popular strategy is to assign a persona to the model to produce more varied, context-sensitive responses, similar to how responses vary across human individuals. Against the expectation that persona prompting yields a wide range of opinions, our experiments show that LLMs keep consistent value orientations. We observe a persistent inertia in their responses, where certain moral and value dimensions (especially harm avoidance and fairness) stay skewed in one direction across persona settings. To study this, we use role-play at scale, which pairs randomized persona prompts with a macro-level analysis of model outputs. Our results point to strong internal biases and value preferences in LLMs, which we call value orientation and inertia. These models warrant scrutiny and adjustment before use in applications where balanced outputs matter.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that LLMs exhibit persistent 'inertia' in moral and value judgments—particularly low variation and consistent skews in harm avoidance and fairness—across randomized persona prompts, contrary to the expectation that such prompting should produce human-like diversity in responses. This is demonstrated via large-scale role-play experiments with macro-level analysis of model outputs, pointing to strong internal value orientations and biases.

Significance. If substantiated with appropriate controls, the result would highlight a practically relevant limitation of persona prompting for achieving balanced outputs in LLMs, with implications for applications in ethics-sensitive domains. The scalable role-play methodology and focus on specific value dimensions (harm avoidance, fairness) represent a constructive empirical approach to studying model biases.

major comments (2)
  1. [Methods / Experimental Setup] The experimental design (as described in the abstract and methods) omits a human baseline using identical persona prompts, questions, and scoring procedure. This is load-bearing for the central inertia claim, as the observed consistency could reflect ineffective role-play design rather than model-specific value orientation; without this control, the macro-level LLM analysis alone cannot isolate the effect.
  2. [Abstract / Results] Abstract and results presentation: no details are supplied on the number of personas, number of trials per persona, statistical tests for variation, or controls for prompt sensitivity. This prevents assessment of whether the reported low variation in harm avoidance and fairness actually supports the inertia conclusion at the claimed strength.
minor comments (2)
  1. [Methods] Clarify the exact set of moral/value dimensions tested and how they were scored (e.g., any reference to established inventories like MFQ).
  2. [Introduction] The term 'value orientation' is introduced without a precise operational definition distinguishing it from simple output bias.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the scope and presentation of our work. We respond point-by-point to the major comments below.

  1. Referee: [Methods / Experimental Setup] The experimental design (as described in the abstract and methods) omits a human baseline using identical persona prompts, questions, and scoring procedure. This is load-bearing for the central inertia claim, as the observed consistency could reflect ineffective role-play design rather than model-specific value orientation; without this control, the macro-level LLM analysis alone cannot isolate the effect.

    Authors: The inertia claim is defined with respect to LLMs: randomized persona prompts produce low variation in specific value dimensions (harm avoidance and fairness) within model outputs. The persona construction follows standard practices for eliciting diverse human-like responses, and the macro-level analysis across many personas isolates the models' failure to vary. While a human baseline would be a useful extension for comparing effect sizes, it is not required to demonstrate that LLMs exhibit the reported inertia under this prompting regime; the expectation of diversity is drawn from the broader literature on human individual differences rather than from a within-study control.

    Revision: no

  2. Referee: [Abstract / Results] Abstract and results presentation: no details are supplied on the number of personas, number of trials per persona, statistical tests for variation, or controls for prompt sensitivity. This prevents assessment of whether the reported low variation in harm avoidance and fairness actually supports the inertia conclusion at the claimed strength.

    Authors: The full manuscript reports the experimental parameters (number of personas, trials per persona, statistical tests, and prompt-sensitivity controls) in the methods and results sections. These details were condensed in the abstract for length. We will expand the abstract and add a summary table in the results to explicitly state the scale of the experiments, the statistical tests applied to variation, and the controls used.

    Revision: yes

Circularity Check

0 steps flagged

No circularity: purely empirical measurement with no derivations or fitted inputs

The paper reports experimental results from role-play prompting and macro-level output analysis. No equations, parameters, or derivations are present. The central claim (persistent inertia in value orientations across personas) is presented as a direct observation from the data, not derived from or reduced to any prior inputs by construction. The assumption that personas should produce diversity is an interpretive framing, not a load-bearing self-referential step. No self-citations are invoked to justify uniqueness or ansatzes. This is a standard empirical study; the derivation chain is empty.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The claim is an empirical observation from prompting experiments and does not rely on mathematical axioms, free parameters, or new postulated entities.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. XtraGPT: Context-Aware and Controllable Academic Paper Revision via Human-AI Collaboration

    cs.CL · 2025-05 · conditional · novelty 6.0

    XtraGPT is a suite of 1.5B-14B parameter open-source LLMs fine-tuned on 140,000 revision pairs from 7,000 top-tier papers to support controllable, context-aware academic paper editing.

  2. Human Psychometric Questionnaires Mischaracterize LLM Psychology: Evidence from Generation Behavior

    cs.CL · 2025-09 · unverdicted · novelty 5.0

    Questionnaire-based and generation-based psychological profiles for LLMs are substantially different, indicating that established human questionnaires reflect desired behavior instead of stable psychological constructs.
