Automatic Causal Fairness Analysis with LLM-Generated Reporting

Alessandro Antonucci; Alessia Berarducci; Eric Rossetto; Marco Zaffalon

arxiv: 2604.27011 · v2 · pith:E663YUJQnew · submitted 2026-04-29 · 💻 cs.LG · cs.AI

Pith reviewed 2026-05-07 13:25 UTC · model grok-4.3

keywords causal fairness · automated analysis · large language models · counterfactual queries · dataset auditing · fairness reporting · machine learning preprocessing

The pith

FairMind automates fairness checks on training datasets by computing causal effects from counterfactual queries and using LLMs to generate the reports.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a software prototype that automates fairness analysis at the dataset level before models are trained. It applies a causal framework to calculate the effects of protected features on outcomes through counterfactual reasoning after data preprocessing. The tool performs closed-form computations of these effects and then prompts large language models in a zero-shot setup to convert the results into readable reports. Examples illustrate that this produces more accurate summaries than asking the models to evaluate fairness directly from the data. Extensions are discussed for ordinal protected variables, continuous targets, and additional decomposition of the effects.

Core claim

FairMind implements closed-form computation of causal fairness effects based on counterfactual queries involving protected attributes, confounders, mediators, and targets, then exploits large language models in a zero-shot setup to generate accurate reports on detected fairness levels, with examples showing advantages relative to direct LLM analysis of the data.

What carries the argument

Closed-form computation of causal effects from counterfactual queries on protected features and outcomes, followed by zero-shot LLM generation of natural-language fairness reports.
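
The closed-form step can be illustrated with a minimal Python sketch. It assumes a single discrete confounder Z and a single discrete mediator W, and it uses the standard adjustment and mediation formulas for the total (TE), natural direct (NDE) and natural indirect (NIE) effects; the paper's own formulae (its Prop. 1) are not reproduced on this page and may differ. The function name `causal_effects` and the `(x, z, w, y)` row layout are hypothetical, not FairMind's interface.

```python
from collections import Counter


def causal_effects(rows, x0, x1, y_val):
    """Plug-in TE, NDE and NIE of X on the event Y == y_val for x0 -> x1.

    rows: iterable of (x, z, w, y) tuples with discrete values, where z is
    the confounder and w the mediator (hypothetical layout for illustration).
    """
    rows = list(rows)
    n = len(rows)
    n_z = Counter(z for _, z, _, _ in rows)
    n_xz = Counter((x, z) for x, z, _, _ in rows)
    n_xzw = Counter((x, z, w) for x, z, w, _ in rows)
    n_xz_y = Counter((x, z) for x, z, _, y in rows if y == y_val)
    n_xzw_y = Counter((x, z, w) for x, z, w, y in rows if y == y_val)
    w_vals = {w for _, _, w, _ in rows}

    def ratio(num, den):
        # Empty cells contribute 0 here; a real tool would have to treat
        # such positivity violations explicitly.
        return num / den if den else 0.0

    te = nde = nie = 0.0
    for z, count_z in n_z.items():
        p_z = count_z / n
        # Backdoor adjustment over Z: P(y | x1, z) - P(y | x0, z)
        te += p_z * (ratio(n_xz_y[(x1, z)], n_xz[(x1, z)])
                     - ratio(n_xz_y[(x0, z)], n_xz[(x0, z)]))
        for w in w_vals:
            p_w0 = ratio(n_xzw[(x0, z, w)], n_xz[(x0, z)])      # P(w | x0, z)
            p_w1 = ratio(n_xzw[(x1, z, w)], n_xz[(x1, z)])      # P(w | x1, z)
            p_y0 = ratio(n_xzw_y[(x0, z, w)], n_xzw[(x0, z, w)])  # P(y | x0, z, w)
            p_y1 = ratio(n_xzw_y[(x1, z, w)], n_xzw[(x1, z, w)])  # P(y | x1, z, w)
            nde += p_z * p_w0 * (p_y1 - p_y0)   # change X, keep W as under x0
            nie += p_z * p_y0 * (p_w1 - p_w0)   # keep X, shift W to its x1 law
    return {"TE": te, "NDE": nde, "NIE": nie}


if __name__ == "__main__":
    # Toy records (x, z, w, y); values are made up for illustration.
    toy = [
        (0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 1), (0, 0, 1, 0),
        (1, 0, 0, 1), (1, 0, 1, 1), (1, 0, 1, 1), (1, 0, 1, 0),
        (0, 1, 0, 0), (0, 1, 0, 0), (0, 1, 1, 1),
        (1, 1, 0, 0), (1, 1, 1, 1), (1, 1, 1, 1),
    ]
    print(causal_effects(toy, x0=0, x1=1, y_val=1))
```

For these plug-in estimates the identity TE(x0→x1) = NDE(x0→x1) − NIE(x1→x0) holds by construction, so comparing the two sides is a cheap sanity check on an implementation.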

If this is right

Datasets can be audited for fairness automatically during the preprocessing stage of machine learning pipelines.
Users obtain natural-language fairness summaries without needing to interpret causal quantities themselves.
The method applies to cases with ordinal protected variables and continuous outcome variables.
Additional decomposition results permit more detailed breakdown of fairness effects.
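
One plausible reading of the ordinal and continuous extensions listed above is a total effect on the expected outcome, computed between consecutive levels of the protected attribute. The sketch below follows that reading under the assumption of a single discrete confounder Z; the helper `stepwise_total_effects` and its `(x, z, y)` row layout are hypothetical, and the paper's actual treatment is not shown on this page.

```python
from collections import defaultdict


def stepwise_total_effects(rows, levels):
    """TE on E[Y] between consecutive levels of an ordinal attribute X.

    rows: iterable of (x, z, y) with ordinal x, discrete confounder z and
    numeric y. levels: the values of x in increasing order.
    Hypothetical helper for illustration only.
    """
    rows = list(rows)
    n = len(rows)
    z_count = defaultdict(int)
    y_sum = defaultdict(float)
    cell = defaultdict(int)
    for x, z, y in rows:
        z_count[z] += 1
        y_sum[(x, z)] += y
        cell[(x, z)] += 1

    def adjusted_mean(x):
        # Backdoor adjustment: sum over z of E[Y | x, z] * P(z)
        total = 0.0
        for z, count_z in z_count.items():
            if cell.get((x, z), 0) == 0:
                raise ValueError(f"no rows with x={x!r}, z={z!r} (positivity violated)")
            total += (y_sum[(x, z)] / cell[(x, z)]) * (count_z / n)
        return total

    means = [adjusted_mean(x) for x in levels]
    # One (lower level, upper level, effect) triple per adjacent pair.
    return [(levels[k], levels[k + 1], means[k + 1] - means[k])
            for k in range(len(levels) - 1)]
```

Raising on an empty (x, z) cell, rather than silently skipping it, is a deliberate choice in this sketch: the adjustment formula is undefined when a level never occurs in some stratum.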

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Embedding the prototype inside AutoML frameworks could flag biased data early and reduce the number of unfair models that reach deployment.
The same pattern of causal computation plus LLM reporting might extend to automated checks for other dataset properties such as robustness or missing-data patterns.
Large-scale tests on real-world datasets from different domains would reveal how often the zero-shot reports align with expert judgment.

Load-bearing premise

The standard fairness model supplies a complete basis for quantifying fairness through counterfactual causal effects from observed data, and LLMs can turn those numerical results into precise reports without extra training or verification steps.
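
To make the premise concrete, the block below writes the usual counterfactual definitions of the three effects and the identified form of the total effect. It is a sketch in standard mediation notation, assuming a discrete confounder set $Z$ and mediator set $W$; the symbols are not taken from the paper, whose Eqs. (1)–(5) are not reproduced on this page.

```latex
\begin{align}
\mathrm{TE}_{x_0,x_1}(y)  &= P(y_{x_1}) - P(y_{x_0}) \\
\mathrm{NDE}_{x_0,x_1}(y) &= P(y_{x_1, W_{x_0}}) - P(y_{x_0}) \\
\mathrm{NIE}_{x_0,x_1}(y) &= P(y_{x_0, W_{x_1}}) - P(y_{x_0}) \\
\mathrm{TE}_{x_0,x_1}(y)  &= \sum_{z} \bigl[ P(y \mid x_1, z) - P(y \mid x_0, z) \bigr]\, P(z)
\end{align}
```

The last line is where the premise bites: the equality between the counterfactual contrast and the observational sum relies on $Z$ blocking all confounding between the protected attribute and the outcome. If that fails, the right-hand side can still be computed from data but no longer equals the total effect.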

What would settle it

Running the prototype on datasets with independently verified causal effects and human-written fairness reports, then checking whether the computed values and LLM outputs match the independent assessments.
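
One part of such a check, comparing the numbers quoted in an LLM report against the closed-form values, can be sketched in a few lines of Python. The helper `check_report`, the tolerance and the sample text are all hypothetical; the sketch only tests whether each computed value appears somewhere in the report, which is a weak proxy for the expert comparison described above.

```python
import re


def check_report(report, effects, tol=0.005):
    """Flag which computed effects are quoted (within tol) in a report.

    report: free text produced by an LLM. effects: dict name -> float.
    Returns dict name -> bool. Hypothetical helper for illustration.
    """
    # Normalise the Unicode minus sign, then pull out signed decimals.
    text = report.replace("\u2212", "-")
    numbers = [float(tok) for tok in re.findall(r"[-+]?\d*\.\d+", text)]
    return {
        name: any(abs(found - value) <= tol for found in numbers)
        for name, value in effects.items()
    }


if __name__ == "__main__":
    sample = ("The total effect is 0.31, of which the direct effect is 0.12; "
              "the indirect effect is -0.05.")
    print(check_report(sample, {"TE": 0.31, "NDE": 0.12, "NIE": 0.19}))
```

A value that is missing or sign-flipped in the text comes back as `False`, which is the kind of misstatement a quantitative evaluation would need to count. The regex treats a hyphen before a number as a minus sign, so ranges such as "0.1-0.2" would need separate handling.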

Figures

Figures reproduced from arXiv: 2604.27011 by Alessandro Antonucci, Alessia Berarducci, Eric Rossetto, Marco Zaffalon.

**Figure 1.** Causal graph depicting the SFM assumptions (a) and an explicit version…

**Figure 2.** Decomposition of the effect of gender on income for the Adult dataset. We adopt the SFM assumptions and consider an SCM as in Fig. 1a. The fairness descriptors in Eqs. (1)–(5) are computed using the formulae in Prop. 1, focusing on the probability of low income when X transitions from male to female. Positive values therefore correspond to discrimination against females. For this model, all descriptors ar…

**Figure 3.** Total and direct effects in the Adult dataset with respect to gender for…

**Figure 4.** Income TE for increasing educational levels in the…

**Figure 5.** Comparing a report directly generated from the data by LLM against…

Original abstract

AutoML, intended as the process of automating the application of machine learning to real-world problems, is a key step for AI popularisation. Most AutoML frameworks are not accounting for the potential lack of fairness in the training data and in the corresponding predictions. We introduce FairMind, a software prototype aiming to automatise fairness analysis at the dataset level. We achieve that by resorting to the assumptions of the standard fairness model, recently proposed by Plečko and Bareinboim. This allows for a sound fairness evaluation in terms of causal effects, based on counterfactual queries involving the target, possibly confounders and mediators, and the different values of an input feature we regard as protected. After the necessary data preprocessing, the tool implements a closed-form computation of the effects. LLMs are consequently exploited to generate accurate reports on the fairness levels detected in the training dataset. We achieve that in a zero-shot setup and show by examples the expected advantages with respect to a direct analysis performed by the LLM. To favour applications, extensions to ordinal protected variable and continuous targets and novel decomposition results are also discussed.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

FairMind applies the Plečko-Bareinboim model to compute causal effects in closed form, then adds zero-shot LLM summaries, but the accuracy of those summaries rests only on examples.

FairMind takes the standard fairness model from Plečko and Bareinboim, runs the required preprocessing, computes the causal effects in closed form, and then hands the results to an LLM for zero-shot report generation. The authors also sketch extensions to ordinal protected attributes, continuous targets, and some decomposition results. That pipeline is the concrete new piece here: a working prototype that tries to make causal fairness checks automatic inside AutoML workflows rather than a fresh theoretical framework.
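
The hand-off from computation to reporting can be sketched as a plain prompt builder in Python. The point of the design is that the LLM receives already-computed effects rather than raw data. The function `build_report_prompt` and the template wording are hypothetical; the prompt actually used by the authors is not given on this page.

```python
def build_report_prompt(effects, protected, target, x0, x1):
    """Assemble a zero-shot prompt from precomputed causal effects.

    effects: dict name -> float, e.g. {"TE": 0.31, "NDE": 0.12}.
    The wording is a hypothetical template, not the paper's prompt.
    """
    # Fixed sign and precision so the report can be checked against the inputs.
    lines = [f"- {name}: {value:+.3f}" for name, value in effects.items()]
    return (
        "You are writing a fairness report for a training dataset.\n"
        f"Protected attribute: {protected} (transition {x0} -> {x1}).\n"
        f"Target: {target}.\n"
        "Causal effects computed in closed form (do not recompute them):\n"
        + "\n".join(lines)
        + "\nExplain the sign and size of each effect in plain language, "
        "quoting the numbers exactly as given."
    )


if __name__ == "__main__":
    # Illustrative numbers only, not results from the paper.
    print(build_report_prompt(
        {"TE": 0.31, "NDE": 0.12, "NIE": -0.19},
        protected="gender", target="low income", x0="male", x1="female",
    ))
```

The resulting string would then be sent to whichever LLM is in use; that call is left out here because the model and client library are not specified.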

Referee Report

2 major / 2 minor

Summary. The manuscript introduces FairMind, a software prototype for automating fairness analysis at the dataset level. It applies the standard fairness model of Plečko and Bareinboim to enable closed-form computation of causal effects (total, direct, and indirect) via counterfactual queries on protected attributes, confounders, and mediators after preprocessing. LLMs are then used in a zero-shot setup to generate reports on detected fairness levels, with examples claimed to show advantages over direct LLM analysis. Extensions for ordinal protected variables, continuous targets, and novel decomposition results are discussed.

Significance. If the closed-form computations are correctly implemented and the LLM reports prove reliable, the work could offer a practical bridge between causal fairness methods and automated tooling, aiding practitioners in dataset-level audits. The explicit use of an established identification model and the extensions to non-binary/continuous cases provide concrete technical value, though the absence of rigorous validation for the LLM component limits immediate impact.

major comments (2)

[§4] §4 (LLM Reporting): The central claim that LLMs generate 'accurate reports' on fairness levels (including effect signs and magnitudes) in zero-shot mode rests only on qualitative examples, with no quantitative metrics such as expert agreement rates, hallucination rates on counterfactual queries, or error analysis against the closed-form outputs. This directly undermines the automation value, as misstated effects would render the reports unusable regardless of correct preprocessing and computation.
[§3.2] §3.2 (Closed-form Computation): While the paper asserts closed-form evaluation under the Plečko-Bareinboim model, no explicit identification formulas, sensitivity checks for unmeasured confounding, or verification that the preprocessing enforces the required assumptions (e.g., no unmeasured confounders between protected attribute and outcome) are provided; this is load-bearing for the soundness of the causal effects.

minor comments (2)

[Abstract] The abstract and introduction could more clearly distinguish the contributions of the closed-form module versus the LLM reporting module, including any pseudocode or interface details for the prototype.
[Figures/Tables] Figure captions and table descriptions lack sufficient detail on how example outputs were generated (e.g., specific LLM prompts or model versions used).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We appreciate the acknowledgment of the potential practical bridge between causal fairness methods and automated tooling that FairMind aims to provide. Below we address each major comment point by point, outlining the revisions we will make to strengthen the paper.

Referee: [§4] §4 (LLM Reporting): The central claim that LLMs generate 'accurate reports' on fairness levels (including effect signs and magnitudes) in zero-shot mode rests only on qualitative examples, with no quantitative metrics such as expert agreement rates, hallucination rates on counterfactual queries, or error analysis against the closed-form outputs. This directly undermines the automation value, as misstated effects would render the reports unusable regardless of correct preprocessing and computation.

Authors: We agree that the current manuscript relies on qualitative examples to illustrate the LLM-generated reports in zero-shot mode. To strengthen the evidence for the reliability of these reports, we will revise Section 4 to include quantitative validation. This will consist of an evaluation across multiple benchmark datasets measuring agreement rates with expert human assessments of fairness levels, as well as an error analysis comparing LLM outputs (effect signs and magnitudes) against the closed-form computations. We will also report hallucination rates on counterfactual queries where applicable. These additions will provide a more rigorous assessment of the automation value. revision: yes
Referee: [§3.2] §3.2 (Closed-form Computation): While the paper asserts closed-form evaluation under the Plečko-Bareinboim model, no explicit identification formulas, sensitivity checks for unmeasured confounding, or verification that the preprocessing enforces the required assumptions (e.g., no unmeasured confounders between protected attribute and outcome) are provided; this is load-bearing for the soundness of the causal effects.

Authors: The closed-form computations implemented in FairMind are derived from the identification results in the Plečko and Bareinboim standard fairness model, which is cited in the manuscript. We acknowledge that greater transparency is needed. In the revised version, we will add the explicit identification formulas for the total, direct, and indirect effects under the model. We will also include a dedicated discussion of the preprocessing pipeline and how it enforces the key assumptions (such as no unmeasured confounding between the protected attribute and outcome). Additionally, we will incorporate a sensitivity analysis subsection to evaluate robustness to potential unmeasured confounding. revision: yes

Circularity Check

0 steps flagged

No significant circularity; pipeline relies on external model and example-based LLM claims

The paper's core chain adopts the externally cited standard fairness model of Plečko and Bareinboim to enable closed-form counterfactual computations after preprocessing; this is independent prior work with no self-citation or reduction to the present paper's inputs. The subsequent zero-shot LLM reporting step is presented as an application shown via examples rather than any derived prediction, fitted parameter, or self-definitional equivalence. No load-bearing step equates an output to its input by construction, and the derivation remains self-contained against the stated external assumptions and benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the validity of the standard fairness model assumptions for counterfactual queries and on the unverified premise that LLMs produce accurate fairness reports from causal summaries without fine-tuning or calibration.

axioms (1)

Domain assumption: assumptions of the standard fairness model proposed by Plečko and Bareinboim.
Invoked to enable sound fairness evaluation in terms of causal effects based on counterfactual queries involving the target, confounders, mediators, and protected features.
