arxiv: 2604.23600 · v1 · submitted 2026-04-26 · 💻 cs.CL

Personality Shapes Gender Bias in Persona-Conditioned LLM Narratives Across English and Hindi: An Empirical Investigation

Pith reviewed 2026-05-08 06:25 UTC · model grok-4.3

classification 💻 cs.CL
keywords: gender bias · LLM persona · personality traits · HEXACO · Dark Triad · story generation · English and Hindi · representational harm

The pith

Personality traits shape both the amount and direction of gender bias in stories generated by LLMs about working professionals.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether prompting LLMs with different personality traits changes how much gender stereotypes appear in generated stories about Indian professionals producing work artifacts. It compares HEXACO traits, which are seen as socially desirable, against Dark Triad traits across English and Hindi outputs from six models. A sympathetic reader would care because persona-driven AI is already used in education, customer service, and social platforms, so personality cues could create uneven representational harms depending on the persona. The study generates 23,400 stories under controlled variations in persona gender, role, and personality, then measures associations between traits and stereotypical content. It reports that Dark Triad traits link to higher gender-stereotypical representations than HEXACO traits, though the strength of the link differs by model and language, showing bias is context-dependent rather than fixed.

Core claim

When LLMs generate stories portraying a working professional in India under systematically varied persona gender, occupational role, and personality traits from the HEXACO and Dark Triad frameworks, the resulting narratives show gender bias whose magnitude and direction are significantly associated with the personality traits. Dark Triad traits are consistently linked to higher gender-stereotypical representations than HEXACO traits, with the pattern holding across English and Hindi but varying in strength across the six models tested.

What carries the argument

Persona-conditioned story generation that incorporates specific personality traits from HEXACO and Dark Triad frameworks into prompts for producing context-specific professional artifacts such as lesson plans, reports, or letters.
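A fully crossed persona design of this kind can be sketched as a grid over the controlled factors. The snippet below is a minimal illustration, not the authors' pipeline: the prompt wording in `TEMPLATE`, the helper `build_conditions`, and the short factor lists are all hypothetical (the roles and traits shown are a small subset taken from the figure captions on this page).

```python
from itertools import product

# Hypothetical placeholder values: the paper's full lists of roles and
# traits are not reproduced on this page.
GENDERS = ["female", "male"]
ROLES = ["nurse", "engineer", "firefighter"]
TRAITS = {
    "HEXACO": ["Honesty-Humility", "Emotionality"],
    "Dark Triad": ["Narcissism", "Psychopathy"],
}
LEVELS = ["high", "low"]
LANGUAGES = ["English", "Hindi"]

# Hypothetical prompt wording, for illustration only.
TEMPLATE = (
    "Write a story in {language} about a {gender} {role} in India who is "
    "producing a work artifact. The character scores {level} on {trait} "
    "({framework})."
)


def build_conditions():
    """Enumerate every combination of the controlled factors."""
    framework_traits = [(fw, t) for fw, ts in TRAITS.items() for t in ts]
    conditions = []
    for gender, role, (framework, trait), level, language in product(
        GENDERS, ROLES, framework_traits, LEVELS, LANGUAGES
    ):
        condition = {
            "gender": gender,
            "role": role,
            "framework": framework,
            "trait": trait,
            "level": level,
            "language": language,
        }
        condition["prompt"] = TEMPLATE.format(**condition)
        conditions.append(condition)
    return conditions


conditions = build_conditions()
# 2 genders * 3 roles * 4 traits * 2 levels * 2 languages = 96 conditions
print(len(conditions))
```

Because each condition differs from its gender-swapped counterpart in the gender field alone, any difference in measured stereotype content between the paired outputs can be attributed to the persona gender under that trait setting.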

If this is right

  • Gender bias in LLM outputs is not a fixed property of the model but changes with the personality traits supplied in the persona prompt.
  • Persona-conditioned systems deployed in education or professional settings can produce content that reinforces gender stereotypes more strongly when the persona uses Dark Triad traits than when it uses HEXACO traits.
  • The link between personality and bias strength differs between English and Hindi, so multilingual applications require language-specific checks.
  • Real-world use of personality-driven LLMs may introduce uneven harms depending on which traits are chosen for the persona.
  • Associations between traits and bias are model-dependent, meaning results from one LLM do not automatically apply to others.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Designers of persona systems could reduce bias by preferring HEXACO-style trait descriptions over Dark Triad ones when generating educational or professional content.
  • The approach could be extended to test whether similar personality-bias links appear in other output types such as dialogue or code comments.
  • Findings imply that bias audits for LLMs should include personality variation as a standard test dimension rather than treating bias as model-only.
  • If the pattern holds in other languages, it would affect how global AI tools handle gender representation in non-English markets.

Load-bearing premise

The method for detecting and quantifying gender-stereotypical representations in the generated stories measures bias in a way that is not itself shaped by the personality conditioning or model-specific artifacts.

What would settle it

Re-measuring the same 23,400 stories with an independent bias quantification method, such as blind human ratings of stereotypical language, and finding no significant difference in gender bias levels between Dark Triad and HEXACO conditioned personas.
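One way to run such a check is a two-sided permutation test on the difference in mean bias score between the two trait groups. The sketch below assumes each story has already received a numeric bias score from the independent method; the function name `permutation_test` and the toy scores are hypothetical, and the paper's own statistical procedure may differ.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns (observed_difference, p_value).
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Toy scores, invented purely to exercise the function.
dark_triad_scores = [0.9] * 20
hexaco_scores = [0.1] * 20
observed, p_value = permutation_test(dark_triad_scores, hexaco_scores, n_permutations=2000)
print(observed, p_value)
```

A large p-value on the re-scored stories would be the outcome that undercuts the paper's central claim; a small one would support it under the independent metric.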

Figures

Figures reproduced from arXiv: 2604.23600 by Aman Chadha, Francesco Pierri, Shreya Gautam, Tanay Kumar, Vinija Jain.

Figure 1. Illustration of personality modulation of …
Figure 2. A step-by-step illustration of our experimental pipeline. Persona-conditioned prompts specifying gender, …
Figure 3. Percentage of generated sentences leaning …
Figure 5. Gender coefficient effects on story-level bias …
Figure 6. Gender-stratified personality effects on bias …
Figure 7. Language-stratified personality effects. En…
Figure 8. Gender-conditioned …
Figure 9. Gender-conditioned High-Score Dark Triad (Psychopathy) artifact from Llama-3.3 for a babysitter, generated in Hindi. Highlighted segments indicate psychopathic behavioral markers: deliberate neglect and deception in the female narrative; physical provocation and behavioral manipulation in the male narrative. English translations are provided beneath each story.
Figure 10. Artifact generated by DeepSeek under High-Score Narcissism conditioning for a seamster. The narrative exhibits classic narcissistic markers, including grandiose self-presentation, exaggerated professional superiority, and repeated framing of routine tailoring work as exceptional artistic achievement.
Figure 12. Artifact generated by DeepSeek under High-Score Narcissism conditioning for a male nurse, written in Hindi. Highlighted segments indicate narcissistic behavioral markers, including self-aggrandizement, repeated emphasis on exceptional ability, and framing routine clinical tasks as evidence of personal brilliance.
Figure 14. Gender-conditioned High-Score Emotionality artifact from Gemma for an engineer. Highlighted segments indicate emotionality-related behavioral markers: intense self-driven passion and dramatic self-perception in the male narrative; heightened anxiety, risk awareness, and emotional sensitivity toward safety concerns in the female narrative.
Figure 16. Artifact generated by Gemma under High-Score Emotionality conditioning for a firefighter, written in Hindi. Highlighted segments indicate emotionally intense motivations, including strong desires for power, internal conflict, and dramatic self-framing within a dangerous situation.
Figure 15. Artifact generated by Gemma under High-Score Emotionality conditioning for an HR executive, written in Hindi. Highlighted segments indicate emotionality-related markers including anxiety, heightened sensitivity to environmental stimuli, and a strong concern for providing reassurance and safety to others.
Figure 18. Model family variation in personality-conditioned bias magnitude relative to GPT-5 nano. Larger instruction-tuned models like Llama-3.3, Mixtral show small shifts, smaller/SSM models show the strongest female-stereotypical alignment.
Figure 19. Comparison of bias scores for female personas with high and low trait levels across English and Hindi
Figure 20. Comparison of bias scores for male personas with high and low trait levels across English and Hindi
Figure 21. Density distribution of bias scores for GPT-5 nano across high-scored and low-scored English artifacts.
Figure 22. Density distribution of bias scores for GPT-5 nano across high-scored and low-scored Hindi artifacts.
Figure 23. Density distribution of bias scores for Llama 3.3 across high-scored and low-scored English artifacts.
Figure 24. Density distribution of bias scores for Llama 3.3 across high-scored and low-scored Hindi artifacts.
Figure 25. Density distribution of bias scores for DeepSeek R1 across high-scored and low-scored English artifacts.
Figure 26. Density distribution of bias scores for DeepSeek R1 across high-scored and low-scored Hindi artifacts.
Figure 27. Density distribution of bias scores for Mixtral across high-scored and low-scored English artifacts.
Figure 28. Density distribution of bias scores for Mixtral across high-scored and low-scored Hindi artifacts.
Figure 29. Density distribution of bias scores for Gemma across high-scored and low-scored English artifacts.
Figure 30. Density distribution of bias scores for Gemma across high-scored and low-scored Hindi artifacts.
Figure 31. Density distribution of bias scores for FalconMamba across high-scored and low-scored English artifacts.
Figure 32. Density distribution of bias scores for FalconMamba across high-scored and low-scored Hindi artifacts.
Original abstract

Large Language Models (LLMs) are increasingly deployed in persona-driven applications such as education, customer service, and social platforms, where models are prompted to adopt specific personas when interacting with users. While persona conditioning can improve user experience and engagement, it also raises concerns about how personality cues may interact with gender biases and stereotypes. In this work, we present a controlled study of persona-conditioned story generation in English and Hindi, where each story portrays a working professional in India producing context-specific artifacts (e.g., lesson plans, reports, letters) under systematically varied persona gender, occupational role, and personality traits from the HEXACO and Dark Triad frameworks. Across 23,400 generated stories from six state-of-the-art LLMs, we find that personality traits are significantly associated with both the magnitude and direction of gender bias. In particular, Dark Triad personality traits are consistently associated with higher gender-stereotypical representations compared to socially desirable HEXACO traits, though these associations vary across models and languages. Our findings demonstrate that gender bias in LLMs is not static but context-dependent. This suggests that persona-conditioned systems used in real-world applications may introduce uneven representational harms, reinforcing gender stereotypes in generated educational, professional, or social content.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript reports a large-scale empirical study of persona-conditioned story generation by six LLMs in English and Hindi. Each of the 23,400 stories depicts an Indian working professional producing domain-specific artifacts under controlled variations of persona gender, occupational role, and personality traits drawn from the HEXACO and Dark Triad inventories. The central claim is that personality traits are statistically associated with both the magnitude and direction of gender bias in the generated narratives, with Dark Triad traits consistently linked to higher gender-stereotypical content than HEXACO traits, although the strength of these associations varies across models and languages.

Significance. If the bias measurements prove robust, the work would be significant for AI ethics and NLP because it shows that gender bias is not a fixed property of LLMs but is modulated by persona conditioning. The scale of the experiment, the cross-lingual design, and the use of established personality frameworks provide empirical breadth that could inform safer deployment of persona-driven systems in education and professional contexts.

major comments (1)
  1. [Methods] Methods section: The quantification of gender-stereotypical representations is load-bearing for the central claim yet appears unvalidated against personality confounds. The paper must demonstrate that the chosen scorer (LLM judge, lexicon, or embedding metric) was tested on personality-matched but gender-neutral control texts; without such validation, higher stereotypical scores for Dark Triad personas could simply reflect the scorer's sensitivity to assertive or negative language patterns rather than an independent gender-bias effect.
minor comments (1)
  1. [Abstract] Abstract: Reports statistically significant associations but omits any description of the bias metric, statistical controls, effect sizes, or inter-annotator procedures, making the strength of evidence difficult to assess from the summary alone.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their detailed and constructive feedback on our manuscript. We address the single major comment below and describe the revisions we will make to strengthen the validation of our gender-bias quantification.

Point-by-point responses
  1. Referee: [Methods] Methods section: The quantification of gender-stereotypical representations is load-bearing for the central claim yet appears unvalidated against personality confounds. The paper must demonstrate that the chosen scorer (LLM judge, lexicon, or embedding metric) was tested on personality-matched but gender-neutral control texts; without such validation, higher stereotypical scores for Dark Triad personas could simply reflect the scorer's sensitivity to assertive or negative language patterns rather than an independent gender-bias effect.

    Authors: We agree that validating the gender-bias scorer against personality confounds is essential for the robustness of our central claim. In the original manuscript, we relied on an established lexicon-based gender-stereotype metric combined with an LLM-as-judge approach, but we did not explicitly test it on personality-matched gender-neutral controls. To address this, we will add a dedicated validation subsection in the Methods. We will generate a set of 1,200 personality-matched but gender-neutral control narratives (using the same occupational roles and LLMs) and evaluate whether the scorer assigns systematically higher stereotypical scores to Dark Triad texts. If sensitivity to assertive or negative language is detected, we will either (a) introduce lexical controls or (b) adopt a more orthogonal bias metric such as embedding-based gender direction projection. The results of this validation, along with any adjustments to our primary analyses, will be reported in the revised manuscript. We believe this addition will directly strengthen the interpretability of the reported associations between personality traits and gender bias.

    Revision: yes
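Option (b) in the rebuttal, an embedding-based gender direction projection, can be sketched as follows. This is a minimal illustration under stated assumptions: the two-dimensional vectors are toy values rather than real embeddings, the helper names (`mean_vector`, `gender_direction`, `projection`) are hypothetical, and the sign convention (positive meaning male-leaning) is an arbitrary choice made for this example. The rebuttal does not specify which embedding model or anchor words would be used.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def gender_direction(male_vecs, female_vecs):
    """Unit vector pointing from the female anchor mean to the male anchor mean."""
    male_mean = mean_vector(male_vecs)
    female_mean = mean_vector(female_vecs)
    diff = [m - f for m, f in zip(male_mean, female_mean)]
    norm = math.sqrt(sum(x * x for x in diff))
    if norm == 0:
        raise ValueError("anchor sets have identical means; direction undefined")
    return [x / norm for x in diff]


def projection(vec, direction):
    """Signed length of vec along the (unit) gender direction."""
    return sum(v * d for v, d in zip(vec, direction))


# Toy 2-D "embeddings", invented for illustration only.
male_anchors = [[1.0, 0.0], [1.0, 0.2]]
female_anchors = [[-1.0, 0.0], [-1.0, 0.2]]
direction = gender_direction(male_anchors, female_anchors)

story_a = [0.3, 0.9]   # hypothetical story embedding
story_b = [-0.4, 0.5]  # hypothetical story embedding
print(projection(story_a, direction), projection(story_b, direction))
```

For the proposed validation, the same projection would be computed on the gender-neutral control narratives: if Dark Triad controls score systematically away from zero, the metric is picking up personality-linked language rather than gender alone.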

Circularity Check

0 steps flagged

No significant circularity in empirical measurement of associations

Full rationale

The paper reports a controlled empirical study that generates 23,400 stories under systematically varied persona conditions (gender, role, HEXACO/Dark Triad traits) across six LLMs and two languages, then computes statistical associations between personality traits and measured gender-stereotypical content. No equations, derivations, parameter fitting, or predictions appear; the central claims are direct observational associations from the outputs. No self-citation is invoked as a uniqueness theorem or load-bearing premise, and the bias quantification step is presented as an independent measurement rather than a redefinition of the inputs. The study is therefore self-contained: its results do not reduce to its input definitions by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

No free parameters or invented entities are introduced. The work rests on standard domain assumptions about LLM text generation and the validity of personality trait frameworks.

axioms (2)
  • domain assumption LLMs can be reliably prompted to adopt specific personas and produce coherent, context-appropriate narratives
    Central to the experimental design of persona-conditioned generation.
  • domain assumption Gender bias can be meaningfully quantified from narrative text using established stereotype detection methods
    Required for the reported associations between personality and bias magnitude.

pith-pipeline@v0.9.0 · 5534 in / 1264 out tokens · 23627 ms · 2026-05-08T06:25:28.972131+00:00 · methodology

