Gender Bias in Perception of Human Managers Extends to AI Managers

Hao Cui; Taha Yasseri

arxiv: 2502.17730 · v4 · submitted 2025-02-24 · 💻 cs.CY

Pith reviewed 2026-05-23 01:26 UTC · model grok-4.3

classification 💻 cs.CY

keywords: gender bias, AI managers, leadership perception, human-AI interaction, randomized controlled trial, workplace decision making, anthropomorphism of AI

The pith

Gender bias against female managers appears in ratings of both human and AI leaders after the manager selects one team member for an award, and it is most pronounced for female AI managers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether people apply the same gender biases to AI managers as to human ones when making workplace decisions. In randomized trials, teams of three worked under a manager who was human or AI and presented as male, female, or unspecified; the manager then chose one person for an extra award. Award winners rated their managers as more trustworthy, competent, and fair and were more willing to work with similar managers again, while non-winners rated them less favorably. Among award winners, male managers, whether human or AI, were received more positively; among non-winners, female managers drew greater skepticism, most of all female AI managers. A reader would care because AI systems are increasingly placed in leadership roles, so unaddressed biases could shape how readily people accept or resist those systems.

Core claim

Participants initially showed no strong preference by manager type or gender, yet after the award process male managers received more positive post-award ratings from recipients while female managers, especially female AI managers, received greater skepticism and negative judgments from non-recipients. The authors conclude that gender bias in leadership perceptions extends beyond human managers to AI-driven decision-makers.

What carries the argument

Randomized controlled trials that assign teams to human or AI managers with male, female, or unspecified gender labels, then measure shifts in perceived trustworthiness, competence, fairness, and future collaboration willingness after the manager selects an award recipient.
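The assignment step of such a design can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract says only that managers were randomly assigned, so the blocked randomization used here, the function name `assign_teams`, and the participant count are all assumptions.

```python
import random
from collections import Counter
from itertools import product

# The 2 x 3 design described in the abstract: manager type x presented gender.
MANAGER_TYPES = ["human", "AI"]
MANAGER_GENDERS = ["male", "female", "unspecified"]
CONDITIONS = list(product(MANAGER_TYPES, MANAGER_GENDERS))  # 6 cells


def assign_teams(participant_ids, team_size=3, seed=0):
    """Shuffle participants into teams and give each team one condition.

    Assumption: conditions are dealt out in shuffled blocks of six so the
    cells stay balanced. The paper may have used a different scheme.
    """
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    usable = len(ids) - len(ids) % team_size  # drop leftovers that cannot fill a team
    teams = [ids[i:i + team_size] for i in range(0, usable, team_size)]

    assignments = []
    block = []
    for index, members in enumerate(teams):
        if index % len(CONDITIONS) == 0:
            block = CONDITIONS[:]  # start a fresh, reshuffled block
            rng.shuffle(block)
        manager_type, manager_gender = block[index % len(CONDITIONS)]
        assignments.append({
            "team": index,
            "members": members,
            "manager_type": manager_type,
            "manager_gender": manager_gender,
        })
    return assignments


# Hypothetical run: 36 participants form 12 teams, two per condition.
assignments = assign_teams(range(36))
cell_counts = Counter((a["manager_type"], a["manager_gender"]) for a in assignments)
```

Blocking is chosen here only because it keeps cell sizes equal in a small illustration; simple independent draws per team would also count as random assignment.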

If this is right

Awarded participants rate male managers higher than female managers whether the manager is human or AI.
Non-awarded participants apply greater negative judgment to female AI managers than to male AI managers.
Willingness to work again with similar managers drops more sharply after a female AI manager withholds an award.
Gender presentation of AI systems will influence acceptance of their decisions in the same way it influences acceptance of human managers.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Designers of AI management tools may reduce bias effects by avoiding explicit gender presentation in interfaces.
The same pattern could appear in other AI decision domains such as performance reviews or resource allocation.
Longer-term workplace deployments could show whether the bias fades with repeated exposure or remains stable.

Load-bearing premise

The experimental labeling of AI managers as male or female produces the same gender perceptions that would arise with actual deployed AI management systems.

What would settle it

A follow-up experiment with the same design that finds no post-award gender difference in ratings of AI managers would show the bias does not extend to AI.
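One way such a follow-up could check for a post-award gender difference is a permutation test on rating shifts. The sketch below is an assumption about how that comparison might be run, not the paper's analysis; the function name `permutation_test` is hypothetical and the numbers are made-up placeholders, not data from the study.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns (observed_difference, p_value). The p-value is the share of
    random relabelings whose absolute mean difference is at least as large
    as the observed one, with the usual +1 correction.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        # Small tolerance guards against floating-point ties.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, (extreme + 1) / (n_permutations + 1)


# Placeholder values only: post-award minus pre-award rating shifts for
# non-awarded participants under male-presented and female-presented AI managers.
shift_male_ai = [-0.5, -1.0, 0.0, -0.5, -1.5, -0.5]
shift_female_ai = [-1.5, -2.0, -1.0, -2.5, -1.0, -1.5]
observed_diff, p_value = permutation_test(shift_male_ai, shift_female_ai)
```

A large p-value in a well-powered replication would be evidence against the extension of the bias to AI managers; a small sample yielding a large p-value would settle little either way.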

Original abstract

As AI becomes more embedded in workplaces, it is shifting from a tool for efficiency to an active force in organizational decision-making. Whether due to anthropomorphism or intentional design choices, people often assign human-like qualities, including gender, to AI systems. However, how AI managers are perceived in comparison to human managers and how gender influences these perceptions remains uncertain. To investigate this, we conducted randomized controlled trials (RCTs) where teams of three participants worked together under a randomly assigned manager. The manager was either a human or an AI and was presented as male, female, or gender-unspecified. The manager's role was to select the best-performing team member for an additional award. Our findings reveal that while participants initially showed no strong preference based on manager type or gender, their perceptions changed notably after experiencing the award process. As expected, those who received awards rated their managers as more trustworthy, competent, and fair, and they were more willing to work with similar managers in the future. In contrast, those who were not selected viewed them less favorably. However, male managers, whether human or AI, were more positively received by awarded participants, whereas female managers, especially female AI managers, faced greater skepticism and negative judgments when they did not give awards. These results suggest that gender bias in leadership extends beyond human managers to include AI-driven decision-makers as well. As AI assumes more managerial responsibilities, understanding and addressing these biases will be crucial for designing fair and effective AI management systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The abstract claims gender bias in leadership extends to AI managers via RCT, but gives no sample sizes, stats, or methods details to check if the data back it.

The main takeaway is that participants rated male managers more favorably after the award decision while female AI managers drew extra skepticism from those who lost out. The paper extends existing gender bias work to AI by running RCTs that assign teams to human or AI managers presented as male, female, or unspecified, then measure perception shifts post-award. That setup is a straightforward way to test whether the bias pattern carries over when the decision-maker is artificial. The design also tries to track how initial views differ from views after the outcome, which adds a useful before-after angle. The topic sits at the intersection of AI ethics and organizational psychology, so the framing is timely. The abstract does not supply any numbers on participants, exclusion rules, statistical tests, or effect sizes, so the directional claims cannot be evaluated. It also skips any description of the actual cues used to signal AI gender, such as names, pronouns, or avatars, which leaves open whether the manipulation matches how people encounter real AI systems. The concern that negative ratings from non-awardees might reflect disappointment with the outcome rather than stable gender bias is hard to dismiss without the controls or timing details. This work would interest readers studying bias in automated decision systems who want empirical extensions of classic findings. A referee could assess the full methods and data, but the current version is too thin on evidence to judge the central result.

Referee Report

3 major / 1 minor

Summary. The manuscript reports findings from randomized controlled trials (RCTs) in which teams of three participants worked under randomly assigned human or AI managers presented as male, female, or gender-unspecified. The manager selected the best-performing team member for an award. The study finds that while initial perceptions showed no strong preferences, post-award perceptions shifted: awarded participants rated managers higher, with male managers (human or AI) receiving more positive evaluations from awardees, and female managers, especially AI ones, facing more negative judgments from non-awardees. The authors conclude that gender bias in leadership extends to AI managers.

Significance. If the empirical results are robustly supported by appropriate statistical analyses, sample sizes, and controls, this work would contribute to the literature on bias in AI systems by extending human leadership bias findings to AI decision-makers. The RCT design is a methodological strength for establishing causal effects in workplace perception studies, and the focus on post-outcome shifts could inform guidelines for fair AI management systems.

major comments (3)

Abstract: The abstract states directional results (e.g., 'male managers... were more positively received by awarded participants, whereas female managers, especially female AI managers, faced greater skepticism') but supplies no sample sizes, statistical tests, controls, exclusion criteria, or raw data summaries. This absence is load-bearing, as it prevents assessment of whether the data support the central claim that gender bias extends to AI managers.
Abstract: No description is provided of the stimuli or cues used to assign male/female/unspecified status to AI managers (e.g., names, pronouns, avatars, voice, or text framing). This detail is central to evaluating whether the gender manipulation produces perceptions comparable to those of human managers.
Abstract: The abstract does not specify the timing of perception measurements (pre- vs. post-award), control conditions, or statistical tests separating gender effects from award receipt outcomes. Without these, the interpretation that post-award rating changes reflect stable gender bias rather than transient reactions to personal outcomes cannot be evaluated.

minor comments (1)

Abstract: The sentence 'As expected, those who received awards rated their managers as more trustworthy...' could clarify whether this expectation was pre-registered or derived from prior literature.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for these constructive comments on the abstract. We agree that additional methodological details would improve transparency and have revised the abstract to incorporate summaries of key elements from the full manuscript while preserving brevity. We respond to each point below.

Referee: Abstract: The abstract states directional results (e.g., 'male managers... were more positively received by awarded participants, whereas female managers, especially female AI managers, faced greater skepticism') but supplies no sample sizes, statistical tests, controls, exclusion criteria, or raw data summaries. This absence is load-bearing, as it prevents assessment of whether the data support the central claim that gender bias extends to AI managers.

Authors: We agree that the original abstract omitted these elements. The revised abstract now includes the overall sample size and a high-level description of the primary statistical tests and controls. Detailed information on exclusion criteria, full statistical models, and raw data summaries is provided in the Methods and Results sections of the manuscript. This revision directly addresses the concern about assessing support for the central claim. revision: yes

Referee: Abstract: No description is provided of the stimuli or cues used to assign male/female/unspecified status to AI managers (e.g., names, pronouns, avatars, voice, or text framing). This detail is central to evaluating whether the gender manipulation produces perceptions comparable to those of human managers.

Authors: We agree this information strengthens the abstract. The revised version now briefly notes the gender cues applied to both human and AI managers. The full manuscript provides the complete description of the stimuli and manipulation checks, confirming comparability across conditions. revision: yes

Referee: Abstract: The abstract does not specify the timing of perception measurements (pre- vs. post-award), control conditions, or statistical tests separating gender effects from award receipt outcomes. Without these, the interpretation that post-award rating changes reflect stable gender bias rather than transient reactions to personal outcomes cannot be evaluated.

Authors: We acknowledge the value of clarifying these aspects. The revised abstract now specifies that perceptions were assessed both pre- and post-award and references the statistical approach used to isolate gender effects from award outcomes. The full paper details the timing, control conditions, and models (including interactions with award receipt) that support the interpretation of stable bias. revision: yes
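The idea of separating a gender effect from the award outcome can be illustrated with an interaction contrast on cell means. This is a simplified sketch under stated assumptions, not the models the rebuttal refers to: the function name `interaction_contrast`, the record fields, and the ratings are hypothetical, and a real analysis would fit a regression with an interaction term and report uncertainty.

```python
from statistics import mean


def interaction_contrast(records):
    """Difference-in-differences of mean ratings across gender x award.

    Computes (awarded - not awarded) separately for female-presented and
    male-presented managers, then subtracts the male gap from the female gap.
    A value of zero means the award outcome moved ratings equally for both;
    a positive value means ratings of female-presented managers depended
    more on the outcome.
    """
    cells = {}
    for record in records:
        key = (record["manager_gender"], record["awarded"])
        cells.setdefault(key, []).append(record["rating"])
    cell_means = {key: mean(values) for key, values in cells.items()}

    award_gap_female = cell_means[("female", True)] - cell_means[("female", False)]
    award_gap_male = cell_means[("male", True)] - cell_means[("male", False)]
    return award_gap_female - award_gap_male


# Placeholder records only, not data from the paper.
example_records = [
    {"manager_gender": "male", "awarded": True, "rating": 6},
    {"manager_gender": "male", "awarded": False, "rating": 4},
    {"manager_gender": "female", "awarded": True, "rating": 6},
    {"manager_gender": "female", "awarded": False, "rating": 2},
]
contrast = interaction_contrast(example_records)
```

Because award receipt lowers ratings for everyone who loses out, only the difference between the two gaps, not either gap alone, speaks to gender.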

Circularity Check

0 steps flagged

Empirical RCT with no derivation chain or fitted inputs

The paper reports results from randomized controlled trials on perceptions of human vs. AI managers with gender manipulations. No equations, derivations, parameters, or self-citations appear in the provided text. All claims rest on direct experimental observations rather than any reduction to prior fitted quantities or self-referential definitions, satisfying the self-contained empirical criterion.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Empirical behavioral study; no mathematical derivations, free parameters, or postulated entities are involved. The gender labeling of AI is an experimental manipulation rather than an invented entity with independent evidence.

pith-pipeline@v0.9.0 · 5766 in / 1080 out tokens · 26426 ms · 2026-05-23T01:26:30.636819+00:00 · methodology
