Personality Shapes Gender Bias in Persona-Conditioned LLM Narratives Across English and Hindi: An Empirical Investigation
Pith reviewed 2026-05-08 06:25 UTC · model grok-4.3
The pith
Personality traits shape both the amount and direction of gender bias in stories generated by LLMs about working professionals.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
When LLMs generate stories portraying a working professional in India under systematically varied persona gender, occupational role, and personality traits from the HEXACO and Dark Triad frameworks, the resulting narratives show gender bias whose magnitude and direction are significantly associated with the personality traits. Dark Triad traits are consistently linked to higher gender-stereotypical representations than HEXACO traits, with the pattern holding across English and Hindi but varying in strength across the six models tested.
What carries the argument
Persona-conditioned story generation that incorporates specific personality traits from HEXACO and Dark Triad frameworks into prompts for producing context-specific professional artifacts such as lesson plans, reports, or letters.
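The controlled design described above can be sketched as a full factorial grid over persona gender, occupational role, personality trait, and language. This is a minimal illustration only: the factor values, the function names `build_conditions` and `render_prompt`, and the prompt wording are all assumptions, since the paper's actual lists and templates are not given here.

```python
from itertools import product

# Illustrative placeholder values; the paper's actual factor lists are not given here.
GENDERS = ["female", "male"]
ROLES = ["teacher", "engineer"]
TRAITS = [("HEXACO", "Honesty-Humility"), ("Dark Triad", "Narcissism")]
LANGUAGES = ["English", "Hindi"]


def build_conditions():
    """Return one dict per cell of the full factorial design."""
    return [
        {"gender": g, "role": r, "framework": f, "trait": t, "language": lang}
        for g, r, (f, t), lang in product(GENDERS, ROLES, TRAITS, LANGUAGES)
    ]


def render_prompt(cond):
    """Hypothetical prompt template, not the wording used in the paper."""
    return (
        f"Write a story in {cond['language']} about a {cond['gender']} "
        f"{cond['role']} working in India whose personality is high in "
        f"{cond['trait']} ({cond['framework']})."
    )


conditions = build_conditions()
prompts = [render_prompt(c) for c in conditions]
```

Crossing every factor with every other keeps the design balanced, so a difference in measured bias between trait frameworks cannot be explained by one framework being paired more often with a particular role or language.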
If this is right
- Gender bias in LLM outputs is not a fixed property of the model but changes with the personality traits supplied in the persona prompt.
- Persona-conditioned systems deployed in education or professional settings can produce content that reinforces gender stereotypes more strongly when the persona uses Dark Triad traits than when it uses HEXACO traits.
- The link between personality and bias strength differs between English and Hindi, so multilingual applications require language-specific checks.
- Real-world use of personality-driven LLMs may introduce uneven harms depending on which traits are chosen for the persona.
- Associations between traits and bias are model-dependent, meaning results from one LLM do not automatically apply to others.
Where Pith is reading between the lines
- Designers of persona systems could reduce bias by preferring HEXACO-style trait descriptions over Dark Triad ones when generating educational or professional content.
- The approach could be extended to test whether similar personality-bias links appear in other output types such as dialogue or code comments.
- Findings imply that bias audits for LLMs should include personality variation as a standard test dimension rather than treating bias as model-only.
- If the pattern holds in other languages, it would affect how global AI tools handle gender representation in non-English markets.
Load-bearing premise
The method for detecting and quantifying gender-stereotypical representations in the generated stories measures bias in a way that is not itself shaped by the personality conditioning or model-specific artifacts.
What would settle it
Re-measuring the same 23,400 stories with an independent bias quantification method, such as blind human ratings of stereotypical language, and finding no significant difference in gender bias levels between Dark Triad and HEXACO conditioned personas.
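One way such a re-measurement could be analysed is a two-sided permutation test on the difference in mean bias ratings between the two persona groups. The sketch below is an assumption about how the comparison might be run, not the paper's procedure; the function `permutation_test` and the ratings are made up for illustration.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def permutation_test(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = mean(a) - mean(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = mean(pooled[:len(a)]) - mean(pooled[len(a):])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)


# Made-up ratings for illustration only; these are not data from the paper.
dark_triad = [0.62, 0.71, 0.58, 0.66, 0.69, 0.74]
hexaco = [0.41, 0.39, 0.47, 0.44, 0.36, 0.43]
observed, p_value = permutation_test(dark_triad, hexaco)
```

Under this setup, a large p-value on independent human ratings would count against the claimed Dark Triad versus HEXACO difference, while a small one would support it.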
Original abstract
Large Language Models (LLMs) are increasingly deployed in persona-driven applications such as education, customer service, and social platforms, where models are prompted to adopt specific personas when interacting with users. While persona conditioning can improve user experience and engagement, it also raises concerns about how personality cues may interact with gender biases and stereotypes. In this work, we present a controlled study of persona-conditioned story generation in English and Hindi, where each story portrays a working professional in India producing context-specific artifacts (e.g., lesson plans, reports, letters) under systematically varied persona gender, occupational role, and personality traits from the HEXACO and Dark Triad frameworks. Across 23,400 generated stories from six state-of-the-art LLMs, we find that personality traits are significantly associated with both the magnitude and direction of gender bias. In particular, Dark Triad personality traits are consistently associated with higher gender-stereotypical representations compared to socially desirable HEXACO traits, though these associations vary across models and languages. Our findings demonstrate that gender bias in LLMs is not static but context-dependent. This suggests that persona-conditioned systems used in real-world applications may introduce uneven representational harms, reinforcing gender stereotypes in generated educational, professional, or social content.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript reports a large-scale empirical study of persona-conditioned story generation by six LLMs in English and Hindi. Each of the 23,400 stories depicts an Indian working professional producing domain-specific artifacts under controlled variations of persona gender, occupational role, and personality traits drawn from the HEXACO and Dark Triad inventories. The central claim is that personality traits are statistically associated with both the magnitude and direction of gender bias in the generated narratives, with Dark Triad traits consistently linked to higher gender-stereotypical content than HEXACO traits, although the strength of these associations varies across models and languages.
Significance. If the bias measurements prove robust, the work would be significant for AI ethics and NLP because it shows that gender bias is not a fixed property of LLMs but is modulated by persona conditioning. The scale of the experiment, the cross-lingual design, and the use of established personality frameworks provide empirical breadth that could inform safer deployment of persona-driven systems in education and professional contexts.
major comments (1)
- [Methods] Methods section: The quantification of gender-stereotypical representations is load-bearing for the central claim yet appears unvalidated against personality confounds. The paper must demonstrate that the chosen scorer (LLM judge, lexicon, or embedding metric) was tested on personality-matched but gender-neutral control texts; without such validation, higher stereotypical scores for Dark Triad personas could simply reflect the scorer's sensitivity to assertive or negative language patterns rather than an independent gender-bias effect.
minor comments (1)
- [Abstract] Abstract: Reports statistically significant associations but omits any description of the bias metric, statistical controls, effect sizes, or inter-annotator procedures, making the strength of evidence difficult to assess from the summary alone.
Simulated Author's Rebuttal
We thank the referee for their detailed and constructive feedback on our manuscript. We address the single major comment below and describe the revisions we will make to strengthen the validation of our gender-bias quantification.
- Referee: [Methods] Methods section: The quantification of gender-stereotypical representations is load-bearing for the central claim yet appears unvalidated against personality confounds. The paper must demonstrate that the chosen scorer (LLM judge, lexicon, or embedding metric) was tested on personality-matched but gender-neutral control texts; without such validation, higher stereotypical scores for Dark Triad personas could simply reflect the scorer's sensitivity to assertive or negative language patterns rather than an independent gender-bias effect.
  Authors: We agree that validating the gender-bias scorer against personality confounds is essential for the robustness of our central claim. In the original manuscript, we relied on an established lexicon-based gender-stereotype metric combined with an LLM-as-judge approach, but we did not explicitly test it on personality-matched gender-neutral controls. To address this, we will add a dedicated validation subsection in the Methods. We will generate a set of 1,200 personality-matched but gender-neutral control narratives (using the same occupational roles and LLMs) and evaluate whether the scorer assigns systematically higher stereotypical scores to Dark Triad texts. If sensitivity to assertive or negative language is detected, we will either (a) introduce lexical controls or (b) adopt a more orthogonal bias metric such as embedding-based gender direction projection. The results of this validation, along with any adjustments to our primary analyses, will be reported in the revised manuscript. We believe this addition will directly strengthen the interpretability of the reported associations between personality traits and gender bias.
  revision: yes
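The embedding-based gender direction projection mentioned in the rebuttal could look roughly like the sketch below. It is a simplified variant that takes the direction as the normalized difference between the mean "male" and mean "female" anchor vectors, which is an assumption and not necessarily what the authors would implement. The 3-dimensional vectors are toy values standing in for real sentence embeddings, and all function names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(u):
    norm = math.sqrt(dot(u, u))
    return [a / norm for a in u]


def mean_vector(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def gender_direction(male_vecs, female_vecs):
    """Unit vector pointing from the female anchor mean to the male anchor mean."""
    male_mean = mean_vector(male_vecs)
    female_mean = mean_vector(female_vecs)
    return normalize([m - f for m, f in zip(male_mean, female_mean)])


def projection_score(text_vec, direction):
    """Cosine of the text embedding with the gender direction (positive = male-leaning)."""
    return dot(normalize(text_vec), direction)


# Toy 3-d vectors for illustration only; real use would need actual embeddings.
male_anchors = [[1.0, 0.2, 0.0], [0.9, 0.1, 0.1]]
female_anchors = [[-1.0, 0.2, 0.0], [-0.9, 0.1, 0.1]]
direction = gender_direction(male_anchors, female_anchors)

score_neutral = projection_score([0.0, 1.0, 0.0], direction)
score_male_leaning = projection_score([0.8, 0.3, 0.0], direction)
```

For the validation the rebuttal proposes, a well-behaved scorer of this kind should give scores near zero on gender-neutral control narratives regardless of whether the persona is Dark Triad or HEXACO.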
Circularity Check
No significant circularity in empirical measurement of associations
The paper reports a controlled empirical study that generates 23,400 stories under systematically varied persona conditions (gender, role, HEXACO/Dark Triad traits) across six LLMs and two languages, then computes statistical associations between personality traits and measured gender-stereotypical content. No equations, derivations, parameter fitting, or predictions appear; the central claims are direct observational associations from the outputs. No self-citation is invoked as a uniqueness theorem or load-bearing premise, and the bias quantification step is presented as an independent measurement rather than a redefinition of the inputs. The study is therefore self-contained: its results are measured from the generated outputs and do not reduce to the input definitions by construction.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: LLMs can be reliably prompted to adopt specific personas and produce coherent, context-appropriate narratives
- domain assumption: Gender bias can be meaningfully quantified from narrative text using established stereotype detection methods