Towards Fairness under Label Bias in Image Segmentation: Impact, Measurement and Mitigation

Aasa Feragen; Aditya Parikh; Sneha Das; Stella Frank

arxiv: 2605.06891 · v1 · submitted 2026-05-07 · 💻 cs.CV · cs.LG

Pith reviewed 2026-05-11 01:04 UTC · model grok-4.3

classification 💻 cs.CV cs.LG

keywords label bias, image segmentation, fairness, confident learning, bias detection, bias mitigation, feature space separability, demographic subgroups

The pith

Label bias in segmentation can be detected by comparing training labels to a model's confident predictions and mitigated by leveraging the resulting subgroup separability in feature space.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that group-conditional label errors in segmentation datasets can be isolated without clean ground truth by contrasting the provided annotations against the model's own high-confidence outputs on the training images. This comparison reveals directional errors that standard overlap measures overlook because they do not distinguish systematic subgroup mistakes from random noise. The work further demonstrates that such bias produces measurable separability between demographic subgroups inside the encoder's feature space, and turns this structure into a mitigation signal rather than treating it as noise to suppress. A reader would care because many practical segmentation tasks rely on imperfect annotations that create unequal performance across populations, and obtaining unbiased reference labels is rarely feasible at scale. The approach is shown to restore equitable results on both synthetic and real datasets by operating directly on the biased training collection.

Core claim

Adapting confident learning to segmentation allows directional label errors to be quantified by direct comparison of training labels with the model's confident predictions, exposing bias where Dice-style metrics remain insensitive. This bias in turn induces separable clusters for different subgroups within the encoder feature space, and the method exploits that separability as a constructive signal for mitigation instead of attempting to remove it.
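The comparison of training labels against confident predictions can be sketched as a count matrix, in the spirit of the confident joint from Confident Learning. This is a minimal illustration and not the authors' implementation: the function name `confident_joint`, the flattening of all pixels into one array, and the per-class threshold (mean predicted probability of a class over the pixels labeled with it) are assumptions made here for the example.

```python
import numpy as np

def confident_joint(observed, probs):
    """Count pixels by (observed label, confidently predicted label).

    observed: int array of shape (N,), the provided training label per pixel.
    probs:    float array of shape (N, K), predicted class probabilities.
    Returns a (K, K) integer matrix C where C[j, k] counts pixels labeled j
    whose confident prediction is k. Hypothetical helper for illustration.
    """
    n_classes = probs.shape[1]
    # Assumed threshold rule: mean predicted probability of class k over the
    # pixels labeled k. Classes with no labeled pixels are never confident.
    thresholds = np.array([
        probs[observed == k, k].mean() if np.any(observed == k) else np.inf
        for k in range(n_classes)
    ])
    above = probs >= thresholds                # (N, K) confident-class mask
    # Among confident classes keep the most probable one per pixel.
    masked = np.where(above, probs, -np.inf)
    confident_class = masked.argmax(axis=1)
    keep = above.any(axis=1)                   # drop pixels with no confident class
    joint = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(joint, (observed[keep], confident_class[keep]), 1)
    return joint

# Toy binary example: class 0 = background, class 1 = foreground.
observed = np.array([0, 0, 0, 1, 1, 1])
probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9],
                  [0.2, 0.8], [0.1, 0.9], [0.7, 0.3]])
joint = confident_joint(observed, probs)
```

In this toy case the off-diagonal entries keep the two directions apart: `joint[0, 1]` counts pixels labeled background but confidently predicted as foreground, and `joint[1, 0]` counts the reverse. Computing the matrix separately per subgroup would be one way to look for group-conditional patterns.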

What carries the argument

Directional error isolation obtained by contrasting provided training labels against confident model predictions, together with the induced subgroup separability in encoder feature space used as the lever for mitigation.
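The summary does not say how subgroup separability is measured, so the following is only a hypothetical proxy: the fraction of samples whose nearest group centroid in feature space belongs to their own subgroup. The function name `centroid_separability` and the use of one pooled feature vector per image are assumptions for illustration.

```python
import numpy as np

def centroid_separability(features, groups):
    """Fraction of samples whose nearest group centroid is their own group.

    features: float array (N, D), e.g. one pooled encoder vector per image.
    groups:   int array (N,), subgroup id per image.
    Scores are computed in-sample, so they are optimistic; values close to
    1.0 suggest the subgroups form clearly separated clusters.
    """
    ids = np.unique(groups)
    centroids = np.stack([features[groups == g].mean(axis=0) for g in ids])
    # Squared Euclidean distance from every sample to every centroid.
    dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    nearest = ids[dists.argmin(axis=1)]
    return float((nearest == groups).mean())

# Two well-separated toy subgroups in a 2-D feature space.
features = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
groups = np.array([0, 0, 1, 1])
score = centroid_separability(features, groups)
```

A held-out split or a trained linear probe would give a less optimistic estimate; the nearest-centroid rule is used here only because it needs no fitting.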

If this is right

Label bias becomes detectable and characterizable directly from the training set without any clean reference annotations.
Standard overlap metrics miss the directional, group-specific nature of the errors that the comparison isolates.
Feature-space subgroup separability can be used as a mitigation resource rather than suppressed.
Equitable performance across subgroups is achievable on both synthetic bias and real-life annotation bias without external clean data.
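A small worked example shows why an overlap score alone cannot reveal error direction. The masks below are constructed for illustration and are not from the paper: an under-segmented label (20 pixels missing) and an over-segmented label (25 pixels added) are both compared with the same hypothetical clean reference mask of 100 pixels.

```python
import numpy as np

def dice(a, b):
    """Dice overlap between two boolean masks."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def directional_errors(label, reference):
    """Return (omitted, added) pixel counts of `label` relative to `reference`."""
    omitted = int(np.logical_and(reference, ~label).sum())  # missing from label
    added = int(np.logical_and(label, ~reference).sum())    # extra in label
    return omitted, added

# Flattened toy masks: the reference object covers the first 100 of 300 pixels.
reference = np.zeros(300, dtype=bool)
reference[:100] = True
under = np.zeros(300, dtype=bool)
under[:80] = True       # omission: 20 object pixels are left unlabeled
over = np.zeros(300, dtype=bool)
over[:125] = True       # commission: 25 background pixels are labeled as object

dice_under = dice(under, reference)   # 2*80 / (80 + 100)
dice_over = dice(over, reference)     # 2*100 / (125 + 100)
```

Both Dice scores equal 8/9, although one label shrinks the object and the other enlarges it, and the omission case reaches that score with fewer wrong pixels. The directional counts are `(20, 0)` and `(0, 25)` respectively, which is the kind of information the off-diagonal entries of a confident joint are meant to preserve.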

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same confident-prediction comparison might be applied to other dense prediction tasks where label noise is also group-dependent.
If the initial model is too strongly shaped by the biased labels, its confident outputs may no longer provide a trustworthy reference.
The separability signal could be combined with existing fairness regularizers to test whether further gains appear on real demographic data.

Load-bearing premise

The model's confident predictions are less biased than the original training labels and can therefore serve as a reliable reference for identifying directional errors.

What would settle it

On a dataset where unbiased ground truth is known, inject controlled group-conditional label noise, retrain, and check whether the confident predictions align more closely with the true labels than the noisy training labels do; if they do not, or if the mitigation step leaves subgroup performance gaps unchanged, the central claim fails.
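The first half of that test can be sketched with NumPy alone. The figure captions mention morphological erosion as the synthetic omission bias, so the sketch erodes the masks of one subgroup only and then measures per-group disagreement with the known clean masks. The helper names (`erode`, `inject_omission_bias`, `group_error_rates`), the 4-neighbour structuring element, and the toy data are assumptions for illustration; the retraining step is not shown.

```python
import numpy as np

def erode(mask, iterations=1):
    """Binary erosion with a 4-neighbour cross, written with NumPy only."""
    out = mask.astype(bool)
    for _ in range(iterations):
        p = np.pad(out, 1, constant_values=False)
        # A pixel survives only if it and its four neighbours are all set.
        out = (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
               & p[1:-1, :-2] & p[1:-1, 2:])
    return out

def inject_omission_bias(masks, groups, biased_group, iterations=1):
    """Erode the masks of one subgroup only; other groups stay untouched."""
    noisy = masks.astype(bool).copy()
    for i in np.flatnonzero(groups == biased_group):
        noisy[i] = erode(noisy[i], iterations)
    return noisy

def group_error_rates(candidate, truth, groups):
    """Per-group fraction of pixels where `candidate` disagrees with `truth`."""
    return {int(g): float((candidate[groups == g] != truth[groups == g]).mean())
            for g in np.unique(groups)}

# Four 8x8 clean masks, each with a 4x4 square; images 2 and 3 are in group 1.
truth = np.zeros((4, 8, 8), dtype=bool)
truth[:, 2:6, 2:6] = True
groups = np.array([0, 0, 1, 1])
noisy_labels = inject_omission_bias(truth, groups, biased_group=1)
label_rates = group_error_rates(noisy_labels, truth, groups)
```

Passing the retrained model's confident predictions to `group_error_rates` in place of `noisy_labels` gives the comparison the test calls for: the premise holds on this dataset only if the prediction error rate for the biased group comes out lower than the label error rate stored in `label_rates`.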

Figures

Figures reproduced from arXiv: 2605.06891 by Aasa Feragen, Aditya Parikh, Sneha Das, Stella Frank.

**Figure 2.** CL decomposes annotation failures into omission and commission errors. The joint distribution Q localizes each error type in an off-diagonal matrix, preserving the error direction, whereas Dice/IoU yield near-identical scores for both error types and over-penalize False Negatives. … number of pixels that were observed as class j but confidently predicted by the model as class j′ (y_obs = j, ŷ = j′), as wel…

**Figure 3.** Visual and statistical analysis of synthetic omission bias (morphological erosion) applied to …

**Figure 4.** (Left) Example from the PhC-U373 dataset demonstrating group-conditional label bias (a) …

**Figure 5.** Representative samples from the ISIC 2017 dataset grouped by the mentioned skin tone …

Original abstract

Labeled datasets reflect the biases of their annotation pipelines, which sometimes introduce label bias: group-conditional label errors that cause systematic performance disparities across demographic subgroups. Label bias in image segmentation remains underexplored, as even detecting it typically requires clean, unbiased annotations, which are not readily available. We present a data-centric adaptation of Confident Learning to segmentation, allowing detection of label bias directly in the training data without a clean, unbiased ground truth. By comparing the provided training labels to the model's confident predictions, we isolate directional errors that quantify the presence and nature of bias, where standard overlap metrics like Dice fail. We further show that label bias influences subgroup separability in the encoder's feature space, an artifact we leverage for bias mitigation rather than suppressing it. We evaluate three datasets, spanning from synthetic to real-life bias, showing how our framework reliably detects and mitigates bias without access to clean labels, achieving equitable performance across experimental conditions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper adapts Confident Learning to detect label bias in segmentation without clean labels and tries to use induced feature separability for mitigation, but the core comparison risks circularity since predictions come from a model trained on the biased data.

The main takeaway is a data-centric method for spotting group-conditional label errors in segmentation by pitting the given annotations against a model's confident outputs, then flipping the resulting feature-space separation into a mitigation step. This avoids needing fresh clean labels, which matters for real applications where re-annotating is expensive. They correctly note that Dice-style metrics miss directional bias, and the synthetic-to-real dataset sweep is a reasonable way to test generality. Adapting Confident Learning keeps the detection step grounded in an established idea rather than inventing new machinery from scratch. That part feels practical and worth checking further.

The soft spot is the assumption that the model's confident predictions are systematically less biased than the training labels in a subgroup-specific sense. Because the model is optimized directly on the biased annotations, any group-conditional error patterns can be reproduced in the predictions, which undercuts the claim that the comparison cleanly isolates true label bias. The abstract states they achieve equitable performance across conditions but supplies no numbers, ablations, or error breakdowns, so it is impossible to tell whether the method escapes this loop or simply inherits it. The mitigation step via feature separability carries the same risk.

This is for fairness researchers working on medical or high-stakes segmentation where annotation pipelines introduce demographic skew. It deserves peer review because the problem is concrete, the approach is implementable without extra labels, and the circularity concern is testable with the right experiments. A referee can push for the missing quantitative results and a direct check on whether predictions actually serve as a cleaner reference.

Referee Report

1 major / 2 minor

Summary. The manuscript presents a framework for detecting and mitigating label bias in image segmentation tasks. The approach adapts Confident Learning to segmentation by using a model's confident predictions (trained on potentially biased labels) to identify directional errors in the training labels without needing clean ground truth. It further analyzes the impact of label bias on feature space separability and proposes to leverage this for mitigation. The method is evaluated on three datasets with varying bias types, claiming to achieve equitable performance across subgroups.

Significance. Should the central claims hold, this work would be significant for the field of fair machine learning in computer vision. It addresses a practical challenge in segmentation where clean labels are scarce, offering a way to quantify bias where standard metrics like Dice fail and to mitigate it by exploiting rather than ignoring feature separability. This could influence how biased datasets are handled in applications such as medical image analysis.

major comments (1)

[Method (adaptation of Confident Learning)] The detection step (described in the abstract as comparing training labels to the model's confident predictions) treats predictions from a model trained on the biased labels as a less-biased reference for isolating directional errors. This assumption is load-bearing for the central claim of detecting label bias without clean ground truth, yet no analysis shows that the predictions have lower group-conditional error rates than the labels; the model may reproduce the same subgroup-specific errors, rendering the comparison circular.

minor comments (2)

The abstract supplies no quantitative results, specific metrics, or ablation details, which hinders immediate assessment of the magnitude of bias detection and mitigation gains.
[Method] Formal equations defining the directional error quantification and the feature-space separability metric would improve reproducibility and clarity in the method description.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive and insightful review. The concern regarding the core assumption in our adaptation of Confident Learning is well-taken, and we address it directly below. We will revise the manuscript to include additional analysis and discussion that strengthens the methodological justification.

Referee: The detection step (described in the abstract as comparing training labels to the model's confident predictions) treats predictions from a model trained on the biased labels as a less-biased reference for isolating directional errors. This assumption is load-bearing for the central claim of detecting label bias without clean ground truth, yet no analysis shows that the predictions have lower group-conditional error rates than the labels; the model may reproduce the same subgroup-specific errors, rendering the comparison circular.

Authors: We agree that a direct demonstration of lower group-conditional error rates in the model's confident predictions relative to the training labels would strengthen the claim and reduce the risk of circularity. The method inherits the core premise of Confident Learning that high-confidence predictions can identify systematic label errors even under noise, but we acknowledge that this premise requires explicit support in the segmentation setting. Our evaluation includes synthetic datasets in which the ground-truth labels are known by construction; on these datasets we verify that the confident predictions exhibit measurably lower subgroup-specific error rates than the injected biased labels, confirming that the comparison isolates the directional errors rather than merely reproducing them. For the real-world datasets, where clean labels are unavailable by design, we instead validate the detection step indirectly through the downstream mitigation results, which produce equitable performance gains consistent with the detected bias directions. In the revised manuscript we will add a dedicated subsection that reports the group-conditional error comparison on the synthetic data, discusses the conditions under which the assumption holds (drawing on the noisy-label literature), and clarifies the distinction between synthetic verification and real-data utility.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity; method relies on external model predictions and assumptions without self-referential reduction

The provided abstract and description outline an adaptation of Confident Learning (external method) that compares training labels to a model's confident predictions to isolate directional errors. No equations, derivations, or self-citations are shown that would create a load-bearing loop, self-definition, or fitted-input-called-prediction. The central claim depends on the unverified assumption that predictions are less biased than labels, but this is an external premise rather than a reduction by construction to the paper's own inputs. No uniqueness theorems, ansatzes smuggled via citation, or renaming of known results appear. The derivation chain remains self-contained against external benchmarks and does not reduce to its inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that confident model predictions are less biased than the given labels and can therefore be used as a reference to detect directional errors; no free parameters or invented entities are mentioned in the abstract.

axioms (1)

Domain assumption: Confident predictions from a trained segmentation model can serve as a less-biased reference than the provided training labels for detecting directional label errors.
This assumption is required for the Confident Learning adaptation to isolate bias without clean ground truth.

Reference graph

Works this paper leans on

14 extracted references · 14 canonical work pages

[1]

Noel C. F. Codella, David Gutman, M. Emre Celebi, Brian Helba, Michael A. Marchetti, Stephen W. Dusza, Aadi Kalloo, Konstantinos Liopyris, Nabin Mishra, Harald Kittler, and Allan Halpern. Skin lesion analysis toward melanoma detection: A challenge at the 2017 international symposium on biomedical imaging (isbi), hosted by the international skin imaging co...

work page 2017
[2]

Jessica Dai and Sarah M Brown

doi: 10.1109/ISBI.2018.8363547. Jessica Dai and Sarah M Brown. Label bias, label shift: Fair machine learning with unreliable labels. In NeurIPS 2020 Workshop on Consequential Decision Making in Dynamic Environments, volume 12,

work page doi:10.1109/isbi.2018.8363547 2018
[3]

In: 2020 IEEE Symposium Series on Computational Intelligence (SSCI)

doi: 10.1109/SSCI47803.2020.9308585. Moritz Hardt, Eric Price, and Nati Srebro. Equality of opportunity in supervised learning. Advances in neural information processing systems, 29,

work page doi:10.1109/ssci47803.2020.9308585 2020
[4]

Detecting labeling bias using influence functions

Frida Jørgensen, Nina Weng, and Siavash Bigdeli. Detecting labeling bias using influence functions. arXiv preprint arXiv:2602.19130,

work page arXiv
[5]

Estimating label quality and errors in semantic segmentation data via any model

Vedang Lad and Jonas Mueller. Estimating label quality and errors in semantic segmentation data via any model. arXiv preprint arXiv:2307.05080,

work page arXiv
[6]

Noise-robust medical image segmentation via uncertainty-guided feature enhancement and adaptive noise-aware loss

Jinpeng Li and Han Wang. Noise-robust medical image segmentation via uncertainty-guided feature enhancement and adaptive noise-aware loss. Available at SSRN 6017137. Yunyi Li, Maria De-Arteaga, and Maytal Saar-Tsechansky. Mitigating label bias via decoupled confident learning. arXiv preprint arXiv:2307.08945,

work page arXiv
[7]

Investigating label bias and representational sources of age-related disparities in medical segmentation

Aditya Parikh, Sneha Das, and Aasa Feragen. Investigating label bias and representational sources of age-related disparities in medical segmentation. arXiv preprint arXiv:2511.00477,

work page arXiv
[8]

Making deep neural networks robust to label noise: A loss correction approach

Giorgio Patrini, Alessandro Rozza, Aditya Krishna Menon, Richard Nock, and Lizhen Qu. Making deep neural networks robust to label noise: A loss correction approach. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1944–1952,

work page 1944
[9]

Are demographically invariant models and representations in medical imaging fair?

Eike Petersen, Enzo Ferrante, Melanie Ganz, and Aasa Feragen. Are demographically invariant models and representations in medical imaging fair? arXiv preprint arXiv:2305.01397,

work page arXiv
[10]

Common limitations of image processing metrics: A picture story

Annika Reinke, Minu D Tizabi, Carole H Sudre, Matthias Eisenmann, Tim Rädsch, Michael Baumgartner, Laura Acion, Michela Antonelli, Tal Arbel, Spyridon Bakas, et al. Common limitations of image processing metrics: A picture story. arXiv preprint arXiv:2104.05642,

work page arXiv
[11]

Exploring the interplay of label bias with subgroup size and separability: A case study in mammographic density

Emma AM Stanley, Raghav Mehta, and Mélanie Roschewitz. Exploring the interplay of label bias with subgroup size and separability: A case study in mammographic density. In Fairness of AI in Medical Imaging: Third International Workshop, FAIMI 2025, Held in Conjunction with MICCAI 2025, Daejeon, South Korea, September 23, 2025, Proceedings, page

work page 2025
[12]

That label’s got style: Handling label style bias for uncertain image segmentation

Kilian Zepf, Eike Petersen, Jes Frellsen, and Aasa Feragen. That label’s got style: Handling label style bias for uncertain image segmentation. arXiv preprint arXiv:2303.15850,

work page arXiv
[13]

Mitigating unwanted biases with adversarial learning

Brian Hu Zhang, Blake Lemoine, and Margaret Mitchell. Mitigating unwanted biases with adversarial learning. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pages 335–340,

work page 2018
[14]

De-biased representation learning for fairness with unreliable labels

Yixuan Zhang, Feng Zhou, Zhidong Li, Yang Wang, and Fang Chen. De-biased representation learning for fairness with unreliable labels. arXiv preprint arXiv:2208.00651,

work page arXiv
