Talent or Luck? Evaluating Attribution Bias in Large Language Models

Antonios Anastasopoulos; Aylin Caliskan; Chahat Raj; Jinhao Pan; Mahika Banerjee; Ziwei Zhu

arXiv: 2505.22910 · v2 · submitted 2025-05-28 · 💻 cs.CL

Chahat Raj, Mahika Banerjee, Jinhao Pan, Aylin Caliskan, Antonios Anastasopoulos, Ziwei Zhu

Pith reviewed 2026-05-19 12:17 UTC · model grok-4.3

keywords: attribution bias · large language models · social bias evaluation · attribution theory · demographic fairness · reasoning disparities · cognitive framework

The pith

Large language models attribute event outcomes to talent or luck differently depending on demographic groups.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper draws on attribution theory from social psychology, which explains how people assign internal causes like ability or effort versus external causes like luck or task difficulty to explain why events occur. It builds a framework to test whether large language models make the same kinds of attributions, and whether those attributions differ systematically with the demographics of the individuals involved. This goes beyond checking for overt stereotypes and instead probes disparities in the models' underlying reasoning about responsibility. The approach carries direct implications for fairness because biased attributions could shape how models evaluate or describe people from different groups in real applications. By grounding the test in established psychological mechanisms, the work aims to reveal biases that affect perceptions and decisions.

Core claim

LLMs' attribution of event outcomes based on demographics carries important fairness implications, and this work proposes a cognitively grounded bias evaluation framework to identify how models' reasoning disparities channelize biases toward demographic groups.

What carries the argument

Cognitively grounded bias evaluation framework based on attribution theory, which measures whether models assign internal factors such as talent and effort or external factors such as luck and difficulty to outcomes in ways that vary by demographic group.
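As a rough illustration of what such a measurement could look like, the sketch below computes the share of internal attributions per demographic group and the largest gap between groups. The abstract does not describe the paper's actual scoring procedure, so everything here is an assumption: the label sets (taken from the abstract's own examples), the group names, the function names, and the toy records, which are made up and are not results from the paper.

```python
from collections import defaultdict

# Hypothetical label sets, borrowed from the examples given in the abstract.
INTERNAL = {"effort", "ability"}
EXTERNAL = {"task difficulty", "luck"}


def internal_rate_by_group(records):
    """Return, per group, the share of attributions that are internal.

    `records` is an iterable of (group, attribution) pairs, where
    `attribution` is one of the labels in INTERNAL or EXTERNAL.
    """
    internal = defaultdict(int)
    total = defaultdict(int)
    for group, attribution in records:
        if attribution not in INTERNAL | EXTERNAL:
            raise ValueError(f"unknown attribution label: {attribution!r}")
        total[group] += 1
        if attribution in INTERNAL:
            internal[group] += 1
    return {group: internal[group] / total[group] for group in total}


def max_gap(rates):
    """Largest difference in internal-attribution rate between any two groups."""
    values = list(rates.values())
    return max(values) - min(values)


# Made-up toy data for illustration only (assumed to describe one outcome type).
toy = [
    ("group_a", "ability"), ("group_a", "effort"),
    ("group_a", "effort"), ("group_a", "luck"),
    ("group_b", "luck"), ("group_b", "task difficulty"),
    ("group_b", "effort"), ("group_b", "luck"),
]
rates = internal_rate_by_group(toy)
gap = max_gap(rates)
```

A gap like this is only interpretable per outcome type: crediting success to internal factors is favorable, while crediting failure to internal factors is not, so a real analysis would likely compute the rates separately for successes and failures.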

If this is right

Biased attributions could reinforce stereotypes in model-generated explanations or judgments about individuals.
The framework targets reasoning disparities rather than isolated surface-level associations.
Fairness assessments of LLMs must account for how responsibility is assigned differently to demographic groups.
Applications involving evaluation or decision support, such as education or hiring, may inherit these attribution patterns.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same psychology-derived test could be adapted to probe other implicit reasoning patterns in generative AI systems.
Direct comparisons between LLM attributions and human responses on identical scenarios could quantify how closely models mirror societal biases.
Training adjustments that balance internal and external explanations across groups might reduce the identified disparities.

Load-bearing premise

Attribution theory from social psychology provides a valid and effective basis for evaluating and measuring reasoning-based biases in LLMs toward demographic groups.

What would settle it

Applying the proposed evaluation framework to multiple LLMs and finding no consistent, statistically significant differences in how they attribute causes across demographic groups would indicate the absence of such channeled biases.
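One way to make "statistically significant" concrete is a permutation test on the difference in internal-attribution rate between two groups. The sketch below is a generic statistical technique swapped in for illustration, not the paper's method; the function name, the 0/1 encoding of attributions, and the default parameters are all assumptions.

```python
import random


def permutation_p_value(labels_a, labels_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in internal-attribution rate.

    `labels_a` and `labels_b` are lists of 0/1 flags (1 = internal attribution)
    for two demographic groups. Both the encoding and the choice of test are
    assumptions made for this sketch.
    """
    rng = random.Random(seed)
    n_a = len(labels_a)
    n_b = len(labels_b)
    observed = abs(sum(labels_a) / n_a - sum(labels_b) / n_b)
    pooled = list(labels_a) + list(labels_b)
    hits = 0
    for _ in range(n_permutations):
        # Reassign group membership at random and recompute the rate gap.
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float comparison
            hits += 1
    # Add-one smoothing keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)
```

Under this sketch, consistently large p-values across models and group pairs would count as evidence against the disparity, while small ones would support it. Running many group pairs would also call for a multiple-comparison correction.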

Original abstract

When a student fails an exam, do we tend to blame their effort or the test's difficulty? Attribution, defined as how reasons are assigned to event outcomes, shapes perceptions, reinforces stereotypes, and influences decisions. Attribution Theory in social psychology explains how humans assign responsibility for events using implicit cognition, attributing causes to internal (e.g., effort, ability) or external (e.g., task difficulty, luck) factors. LLMs' attribution of event outcomes based on demographics carries important fairness implications. Most works exploring social biases in LLMs focus on surface-level associations or isolated stereotypes. This work proposes a cognitively grounded bias evaluation framework to identify how models' reasoning disparities channelize biases toward demographic groups.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This paper proposes a cognitively grounded framework for attribution biases in LLMs but supplies no methods, data, or results to evaluate the idea.

Colleague, the main thing to know is that this paper proposes a cognitively grounded framework for attribution biases in LLMs but supplies no methods, data, or results to evaluate the idea.

What is new here is the attempt to move past surface-level stereotype checks by drawing on attribution theory to examine how models assign internal or external causes to outcomes in ways that might disadvantage demographic groups. The authors correctly flag that such attributions carry fairness implications for real decisions. They do a reasonable job noting the limits of most existing bias work and linking the concept to social psychology.

The soft spots are substantial. Only the abstract exists, so there is no account of how attributions would be elicited from model outputs, how they would be scored as internal versus external, what prompts or datasets would be used, or any test results. This leaves the central claim about identifying reasoning-based biases unverified and makes soundness hard to judge.

The paper is aimed at researchers in AI fairness who want evaluation methods informed by cognitive models. A reader already working in that space might see the direction as worth tracking once more details appear. I would not send this for serious peer review yet because there is not enough substance for referees to engage with meaningfully. It could make sense to revisit if a full version with operational details and experiments comes out.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes a cognitively grounded bias evaluation framework based on Attribution Theory from social psychology to examine how large language models attribute event outcomes to internal factors (e.g., effort or ability) versus external factors (e.g., luck or task difficulty), with a focus on how such attributions may channelize biases toward demographic groups and the associated fairness implications. It contrasts this approach with prior work on surface-level associations or isolated stereotypes in LLMs.

Significance. If the framework were fully specified, implemented with clear operationalization, and validated through empirical results, it could offer a more nuanced, psychologically informed lens for detecting reasoning-based biases in LLMs beyond superficial stereotype detection, potentially informing better fairness interventions and deeper insights into model decision processes.

major comments (1)

The manuscript consists solely of an abstract with no description of the proposed framework, including how internal versus external attributions are elicited or scored from model outputs, no operationalization details, no dataset or prompt design, and no results or validation. This absence prevents any evaluation of the central claim that the framework identifies reasoning disparities that channelize biases toward demographic groups.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their review and for highlighting the need for a complete presentation of our proposed framework. We agree that the current submission is limited to the abstract and will substantially expand the manuscript in revision to address this.

Referee: The manuscript consists solely of an abstract with no description of the proposed framework, including how internal versus external attributions are elicited or scored from model outputs, no operationalization details, no dataset or prompt design, and no results or validation. This absence prevents any evaluation of the central claim that the framework identifies reasoning disparities that channelize biases toward demographic groups.

Authors: We agree that the submitted version consists only of the abstract and therefore lacks the necessary details on framework specification, elicitation and scoring of internal versus external attributions, operationalization, prompt and dataset design, and empirical validation. This is a clear limitation of the current manuscript. In the revised version we will include a full description of the cognitively grounded bias evaluation framework, precise methods for prompting LLMs to generate attributions and for scoring them along the internal-external dimension, the operational metrics for detecting demographic disparities in reasoning, the construction of the evaluation datasets and prompts, and the results of validation experiments demonstrating how attribution patterns channelize biases toward demographic groups.

Revision: yes

Circularity Check

0 steps flagged

No derivation chain or equations present; proposal is self-contained

The abstract proposes a cognitively grounded bias evaluation framework drawing on attribution theory but supplies no equations, derivations, fitted parameters, or self-citations. No load-bearing step reduces by construction to its inputs, and the text contains no mathematical or definitional reductions to inspect. This is the normal case of an early-stage proposal without a formal derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no details on free parameters, axioms, or invented entities; ledger is empty as a result.

pith-pipeline@v0.9.0 · 5627 in / 1144 out tokens · 64590 ms · 2026-05-19T12:17:51.474983+00:00 · methodology
