
arxiv: 2511.17146 · v3 · submitted 2025-11-21 · 💻 cs.CV

Learning to Look Closer: A New Instance-Wise Loss for Small Cerebral Lesion Segmentation

Pith reviewed 2026-05-17 20:46 UTC · model grok-4.3

classification 💻 cs.CV
keywords instance-wise loss · small lesion segmentation · cerebral lesions · CC-DiceCE · medical image segmentation · blob loss · nnU-Net · connected component metrics

The pith

CC-DiceCE loss raises detection recall for small cerebral lesions with little impact on segmentation quality.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CC-DiceCE, a loss function built on the CC-Metrics framework that scores segmentation quality separately for each lesion rather than over the whole image. Standard losses such as Dice give small lesions negligible weight because of their tiny volume, so the new loss is tested against a DiceCE baseline and the existing blob loss inside a standardized nnU-Net setup on multiple datasets. Results show higher lesion detection rates while segmentation performance stays largely the same, with only dataset-specific changes in precision, and CC-DiceCE generally beats blob loss. A reader would care because missing small lesions in brain scans can delay diagnosis and treatment.

Core claim

CC-DiceCE loss, based on the CC-Metrics framework, increases detection (recall) with minimal to no degradation in segmentation performance compared to a DiceCE baseline, though with dataset-dependent trade-offs in precision, and our multi-dataset study shows that CC-DiceCE generally outperforms blob loss.

What carries the argument

The CC-DiceCE loss function, which uses connected-component metrics to evaluate and penalize segmentation errors on a per-lesion basis.
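To make the per-lesion idea concrete, here is a minimal sketch of a region-wise DiceCE term added to a global DiceCE term. It is not the authors' implementation: the equal weighting of the two terms, the small `eps` stabiliser, the flat-list data layout, and the precomputed `region_ids` (one region per ground-truth lesion) are all assumptions made for illustration.

```python
import math

def soft_dice_ce(probs, targets, eps=1e-8):
    """Soft Dice loss plus mean binary cross-entropy over a flat list of voxels.

    eps is an illustrative stabiliser, not a setting taken from the paper.
    """
    inter = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    dice_loss = 1.0 - (2.0 * inter + eps) / (denom + eps)
    ce = -sum(
        t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
        for p, t in zip(probs, targets)
    ) / len(probs)
    return dice_loss + ce

def per_region_loss(probs, targets, region_ids):
    """Average the DiceCE loss over regions, so every lesion weighs the same."""
    losses = []
    for region in sorted(set(region_ids)):
        idx = [i for i, rid in enumerate(region_ids) if rid == region]
        losses.append(soft_dice_ce([probs[i] for i in idx], [targets[i] for i in idx]))
    return sum(losses) / len(losses)

def combined_loss(probs, targets, region_ids):
    """Global term plus per-region term (equal weighting is an assumption)."""
    return soft_dice_ce(probs, targets) + per_region_loss(probs, targets, region_ids)

# Toy volume: region 0 holds a 6-voxel lesion, region 1 a 1-voxel lesion.
targets = [1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0]
region_ids = [0] * 8 + [1] * 4
probs_hit = [0.9] * 6 + [0.1] * 2 + [0.1, 0.9, 0.1, 0.1]   # finds the small lesion
probs_miss = [0.9] * 6 + [0.1] * 2 + [0.1, 0.1, 0.1, 0.1]  # misses the small lesion

gap_global = soft_dice_ce(probs_miss, targets) - soft_dice_ce(probs_hit, targets)
gap_region = per_region_loss(probs_miss, targets, region_ids) - per_region_loss(
    probs_hit, targets, region_ids
)
print(f"penalty for the miss, global term:     {gap_global:.3f}")
print(f"penalty for the miss, per-region term: {gap_region:.3f}")
```

In this toy case, missing the one-voxel lesion costs more under the per-region term than under the global term, which is the behaviour the loss is designed to produce.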

If this is right

  • CC-DiceCE increases lesion detection recall compared with the DiceCE baseline.
  • Segmentation performance experiences minimal or no degradation.
  • Precision shows dataset-dependent trade-offs.
  • CC-DiceCE generally outperforms blob loss across the tested datasets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same per-lesion loss approach could be tested on small-object segmentation tasks outside cerebral imaging, such as lung nodules or retinal vessels.
  • Models trained with CC-DiceCE might generalize better to rare or tiny structures when data are highly imbalanced.
  • Combining CC-DiceCE with post-processing steps that merge or filter connected components could further reduce false positives without retraining.

Load-bearing premise

That evaluating segmentation on a per-lesion basis reliably improves clinical usefulness and that nnU-Net comparisons remain fair without hidden dataset-specific effects.

What would settle it

A replication on the same datasets in which CC-DiceCE produces no gain in recall or a clear drop in Dice scores relative to the DiceCE baseline would falsify the central claim.

Original abstract

Traditional loss functions in medical image segmentation, such as Dice, often under-segment small lesions because their small relative volume contributes negligibly to the overall loss. To address this, instance-wise loss functions and metrics have been proposed to evaluate segmentation quality on a per-lesion basis. We introduce CC-DiceCE, a loss function based on the CC-Metrics framework, and compare it with the existing blob loss. Both are benchmarked against a DiceCE baseline within the nnU-Net framework, which provides a robust and standardized setup. We find that CC-DiceCE loss increases detection (recall) with minimal to no degradation in segmentation performance, though with dataset-dependent trade-offs in precision. Furthermore, our multi-dataset study shows that CC-DiceCE generally outperforms blob loss.
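The abstract's point about small lesions contributing negligibly can be checked with a short worked example. The lesion sizes below are hypothetical, chosen only to show the arithmetic. The prediction segments a 1000-voxel lesion perfectly and misses a 5-voxel lesion entirely.

```python
def dice(pred: set, truth: set) -> float:
    """Dice coefficient between two voxel sets (1.0 when both are empty)."""
    if not pred and not truth:
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))

# Hypothetical ground truth: one large and one small lesion.
large = {("large", i) for i in range(1000)}
small = {("small", i) for i in range(5)}
truth = large | small

# Prediction covers the large lesion exactly and misses the small one.
pred = set(large)

global_dice = dice(pred, truth)
# Per-lesion view: score each lesion on its own, then average.
per_lesion_dice = (dice(pred & large, large) + dice(pred & small, small)) / 2

print(f"global Dice     = {global_dice:.4f}")      # 0.9975
print(f"per-lesion Dice = {per_lesion_dice:.4f}")  # 0.5000
```

The global score barely registers the missed lesion, while the per-lesion average drops to one half.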

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces CC-DiceCE, an instance-wise loss derived from the CC-Metrics framework, to mitigate under-segmentation of small cerebral lesions by emphasizing per-lesion detection. It benchmarks CC-DiceCE against a DiceCE baseline and the blob loss within the standardized nnU-Net framework across multiple datasets, reporting gains in lesion recall with little or no Dice degradation and general outperformance over blob loss, albeit with dataset-dependent precision trade-offs.

Significance. If the reported recall improvements prove robust, the work could provide a practical tool for clinical tasks where missing small lesions is costly. The choice of nnU-Net supplies a reproducible baseline that strengthens cross-loss comparisons. The empirical focus and multi-dataset evaluation are appropriate for the claim, though the absence of statistical tests or variance estimates reduces the strength of the performance assertions.

major comments (2)
  1. [Results / Experiments] Results section (and abstract): recall and Dice values are given as single-run point estimates with no standard deviations, confidence intervals, or repeated random seeds. Because nnU-Net training incorporates stochastic augmentation and initialization, and small-lesion recall is known to be sensitive to these factors, the observed deltas cannot be distinguished from training noise without additional runs or statistical tests.
  2. [Results] Table(s) reporting per-dataset metrics: the claim that CC-DiceCE 'generally outperforms blob loss' is qualified by 'dataset-dependent trade-offs in precision,' yet no quantitative measure of consistency (e.g., win rate across datasets or lesion-size strata) is supplied to support the 'generally' qualifier.
minor comments (2)
  1. [Abstract] Abstract: exact dataset sizes, number of lesions, and lesion-size definitions are omitted; these details should appear at least in the first results table or methods paragraph.
  2. [Methods] Notation: the precise mathematical definition of CC-DiceCE (how the CC-Metrics per-lesion scores are folded into the DiceCE term) should be given an equation number and contrasted explicitly with blob loss.
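The analysis requested in major comment 1 could take the following form. This is a sketch, not something the paper reports. The recall values are invented placeholders, and pairing the runs by seed is an assumption. The test is an exact sign-flip permutation test on the paired differences.

```python
from itertools import product
from statistics import mean, stdev

def summarize(values):
    """Return (mean, sample standard deviation) across repeated runs."""
    return mean(values), stdev(values)

def paired_sign_flip_p_value(a, b):
    """Exact two-sided sign-flip permutation test on paired differences a - b."""
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    count = total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed - 1e-12:
            count += 1
    return count / total

# PLACEHOLDER recall values for five seeds; these are not results from the paper.
baseline_recall = [0.60, 0.62, 0.61, 0.59, 0.63]
candidate_recall = [0.66, 0.65, 0.67, 0.64, 0.68]

for name, runs in (("baseline", baseline_recall), ("candidate", candidate_recall)):
    m, s = summarize(runs)
    print(f"{name}: recall = {m:.3f} ± {s:.3f}")

p_value = paired_sign_flip_p_value(candidate_recall, baseline_recall)
print(f"two-sided sign-flip p-value = {p_value:.4f}")
```

With five paired runs the smallest p-value this test can produce is 2/32 = 0.0625, so a per-case pairing or more seeds would be needed to reach conventional significance thresholds.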

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their insightful and constructive comments. We address each of the major comments below and describe the revisions planned for the manuscript.

Point-by-point responses
  1. Referee: [Results / Experiments] Results section (and abstract): recall and Dice values are given as single-run point estimates with no standard deviations, confidence intervals, or repeated random seeds. Because nnU-Net training incorporates stochastic augmentation and initialization, and small-lesion recall is known to be sensitive to these factors, the observed deltas cannot be distinguished from training noise without additional runs or statistical tests.

    Authors: We agree that single-run results limit the ability to assess variability due to stochastic elements in nnU-Net training. To address this, we will rerun the experiments with multiple random seeds for each configuration and report means along with standard deviations for recall, Dice, and precision metrics. We will also consider adding statistical significance tests in the revised manuscript. revision: yes

  2. Referee: [Results] Table(s) reporting per-dataset metrics: the claim that CC-DiceCE 'generally outperforms blob loss' is qualified by 'dataset-dependent trade-offs in precision,' yet no quantitative measure of consistency (e.g., win rate across datasets or lesion-size strata) is supplied to support the 'generally' qualifier.

    Authors: We acknowledge the value of a quantitative consistency measure to support the 'generally outperforms' statement. In the revision, we will add a summary analysis computing win rates for CC-DiceCE versus blob loss across the evaluated datasets and lesion-size groups for the primary metrics. This will provide objective support while retaining the discussion of precision trade-offs. revision: yes
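A win-rate summary of the kind described in the response could be computed as follows. This is a sketch under stated assumptions: the dataset and stratum names and all scores are hypothetical placeholders, and counting a tie as half a win is one common convention, not a choice the authors have stated.

```python
def win_rate(scores_a, scores_b, higher_is_better=True):
    """Fraction of paired comparisons won by A; a tie counts as half a win."""
    wins = 0.0
    for key in scores_a:
        a, b = scores_a[key], scores_b[key]
        if a == b:
            wins += 0.5
        elif (a > b) == higher_is_better:
            wins += 1.0
    return wins / len(scores_a)

# PLACEHOLDER scores keyed by (dataset, lesion-size stratum); not from the paper.
cc_scores = {
    ("dataset_A", "small"): 0.71,
    ("dataset_A", "large"): 0.80,
    ("dataset_B", "small"): 0.55,
    ("dataset_B", "large"): 0.77,
}
blob_scores = {
    ("dataset_A", "small"): 0.65,
    ("dataset_A", "large"): 0.82,
    ("dataset_B", "small"): 0.50,
    ("dataset_B", "large"): 0.75,
}

rate = win_rate(cc_scores, blob_scores)
print(f"win rate of A over B: {rate:.2f}")  # 0.75
```

Reporting this figure per metric, with the number of comparisons behind it, would give the qualifier "generally" a quantitative meaning.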

Circularity Check

0 steps flagged

Empirical loss comparison with no reduction to self-defined quantities

Full rationale

The paper introduces the CC-DiceCE loss by building on the CC-Metrics framework and reports empirical performance gains versus DiceCE and blob loss baselines inside the nnU-Net pipeline across multiple datasets. All central claims (recall increase with limited Dice degradation, general outperformance of blob loss) rest on observed training/evaluation metrics rather than any derivation, prediction, or uniqueness result that reduces by the paper's own equations to fitted parameters or prior self-citations. No load-bearing step collapses to a self-definitional or fitted-input pattern; the work is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The work rests on the domain assumption that nnU-Net supplies a fair standardized benchmark and on the introduction of the CC-DiceCE loss itself; no additional free parameters or invented physical entities are required beyond standard training hyperparameters.

axioms (1)
  • domain assumption: nnU-Net framework provides a robust and standardized setup for fair comparison
    Explicitly invoked to justify the benchmarking protocol.
invented entities (1)
  • CC-DiceCE loss (no independent evidence)
    purpose: Instance-wise loss combining connected-component metrics with Dice and cross-entropy for small-lesion focus
    Newly defined loss function introduced in the paper.

pith-pipeline@v0.9.0 · 5441 in / 1237 out tokens · 46551 ms · 2026-05-17T20:46:39.449780+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

    Relation between the paper passage and the cited Recognition theorem.

    CC-Metrics computes the Voronoi region for every connected component (lesion) and then scores each region individually. This assigns the same weight to every lesion, regardless of size, in the final score. $V_m(P, K) = \frac{1}{|\mathcal{K}|} \sum_{C \in \mathcal{K}} m(P \cap R_C, C)$
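    The quoted formula can be illustrated with a small pure-Python sketch over voxel sets. It is not the CC-Metrics reference implementation. The brute-force nearest-component search, the tie-breaking rule (lowest component index), and the fallback for an empty ground truth are assumptions, and the code is suitable only for tiny lattices.

```python
from itertools import product

def connected_components(voxels):
    """Split a set of (z, y, x) voxels into maximal 26-connected components."""
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    remaining, components = set(voxels), []
    while remaining:
        seed = remaining.pop()
        component, stack = {seed}, [seed]
        while stack:
            z, y, x = stack.pop()
            for dz, dy, dx in offsets:
                neighbour = (z + dz, y + dy, x + dx)
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    component.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components

def sq_dist(t, component):
    """Squared Euclidean distance from voxel t to the nearest voxel of a component."""
    return min(sum((a - b) ** 2 for a, b in zip(t, u)) for u in component)

def voronoi_regions(lattice, components):
    """Assign each lattice voxel to its nearest component (ties: lowest index)."""
    regions = [set() for _ in components]
    for t in lattice:
        nearest = min(range(len(components)), key=lambda i: sq_dist(t, components[i]))
        regions[nearest].add(t)
    return regions

def dice(pred, truth):
    """Dice coefficient between two voxel sets (1.0 when both are empty)."""
    if not pred and not truth:
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))

def cc_metric(pred, truth, lattice, metric=dice):
    """Average the base metric over the Voronoi regions of the ground-truth components."""
    components = connected_components(truth)
    if not components:  # assumption: fall back to the global metric
        return metric(pred, truth)
    regions = voronoi_regions(lattice, components)
    return sum(metric(pred & r, c) for r, c in zip(regions, components)) / len(components)

# Toy 1 x 1 x 10 lattice: a 4-voxel lesion at x = 0..3 and a 1-voxel lesion at x = 8.
lattice = {(0, 0, x) for x in range(10)}
truth = {(0, 0, x) for x in (0, 1, 2, 3, 8)}
pred = {(0, 0, x) for x in (0, 1, 2, 3)}  # misses the small lesion

print(f"global Dice = {dice(pred, truth):.3f}")                # 0.889
print(f"CC-Dice     = {cc_metric(pred, truth, lattice):.3f}")  # 0.500
```

    Because each region contributes equally to the average, the missed one-voxel lesion halves the score here, whereas the global Dice drops by only about 0.11.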

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Instance Awareness of Multi-class Semantic Segmentation Loss Functions

    cs.CV · 2026-04 · unverdicted · novelty 6.0

    Multi-class blob and CC losses via one-vs-rest decomposition and per-component weighting improve foreground Dice, rare-class Dice, and Panoptic Quality on BraTS-METS 2025 compared to baseline.

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages · cited by 1 Pith paper · 2 internal anchors

  1. [1]

    However, conventional voxel-overlap losses, such as the standard Dice with cross-entropy (DiceCE) used in nnU-Net [2], overweight large structures

    INTRODUCTION Automated segmentation of small cerebral lesions in brain MRI enables scalable detection and quantification [1]. However, conventional voxel-overlap losses, such as the standard Dice with cross-entropy (DiceCE) used in nnU-Net [2], overweight large structures. This can reduce small-lesion detection when lesion sizes vary widely [3, 4, 5],...

  2. [2]

    baseline (Dice 0.659) is substantially lower than standard nnU-Net performance (Dice 0.801) [3, 9]. This concern is echoed by [9], which notes that many segmentation studies fail to configure baselines properly or evaluate on too few datasets, which is a notable problem given the high heterogeneity of medical data. We address these limitations by using ...

  3. [3]

    We investigate the potential of CC-Metrics as a loss function for small-lesion segmentation

  4. [4]

    Learning to Look Closer: A New Instance-Wise Loss for Small Cerebral Lesion Segmentation

    We provide a rigorous evaluation of instance-aware losses (CC-Metrics and blob loss) against a strong, standardized baseline (nnU-Net) across multiple heterogeneous datasets. The code for the experiments can be found at https://github.com/TIO-IKIM/Learning-to-Look-Closer. © 2026 IEEE. Personal use of this material is permitted. Permission from IEEE must...

  5. [5]

    Where is VALDO?

    RELATED WORK 2.1. Losses. Let $T \subset \mathbb{Z}^3$ be the lattice, $K \subset T$ the ground-truth foreground, $\mathcal{K}$ the set of its maximal 26-connected components, and $m : \mathcal{P}(T) \times \mathcal{P}(T) \to \mathbb{R}$ a base set metric (e.g., Dice). For $t \in T$ and $A \subseteq T$, define $d(t, A) = \min_{u \in A} \|t - u\|_2$. For each $C \in \mathcal{K}$, the Voronoi region is (ties broken arbitrarily) $R_C = \{\, t \in T : d(t, C) < d(t, C'),\ \forall C' \in \mathcal{K} \setminus \{C\} \,\}$. (1) CC-Metrics computes...

  6. [6]

    Training We evaluated all loss functions within the nnU-Net framework

    EXPERIMENTAL SETUP 3.1. Training. We evaluated all loss functions within the nnU-Net framework. We retain all default nnU-Net parameters, with one exception: we use the non-smooth variant of Dice loss with ϵ = 0, as we observed training instability on the CMB dataset with the default smooth Dice. The base metric m for the instance-wise functions is DiceCE. ...

  7. [7]

    RESULTS As summarized in Tab. 2, replacing the baseline DiceCE with CC-DiceCE maintained the global Dice score within the typical 5-fold variation for each cohort, while improving CC-Dice and recall in most of them. On LAC and CMB, CC-DiceCE increased lesion-wise performance (higher CC-Dice and F1) with a trade-off in global Dice on LAC and a consistent...

  8. [8]

    We also observe increases in CC-Dice in four of five datasets, with a small decrease only on WMH

    DISCUSSION We find that CC-DiceCE improves detection rates (recall) while the change in segmentation performance (Dice) is minimal (at worst −0.011) across all five datasets. We also observe increases in CC-Dice in four of five datasets, with a small decrease only on WMH. We hypothesize that the inclusion of the global DiceCE loss term helps maintain t...

  9. [9]

    CONCLUSION We studied instance-aware objectives for small cerebral lesion segmentation within a strong and standardized nnU-Net setup across five heterogeneous MRI cohorts. Replacing conventional DiceCE with CC-DiceCE consistently improved instance-aware detection (higher recall and CC-Dice) in four of five datasets while having negligible effect on g...

  10. [10]

    Ethical approval was not required as confirmed by the license attached with the open access data

    COMPLIANCE WITH ETHICAL STANDARDS This research study was conducted retrospectively using human subject data made available in open access [11, 12, 6, 13, 14, 15, 16]. Ethical approval was not required as confirmed by the license attached with the open access data

  11. [11]

    The brain tumor segmentation-metastases (brats-mets) challenge 2023: Brain metastasis segmentation on pre-treatment mri,

    Ahmed W Moawad, Anastasia Janas, Ujjwal Baid, Divya Ramakrishnan, Rachit Saluja, Nader Ashraf, Nazanin Maleki, Leon Jekel, Nikolay Yordanov, Pascal Fehringer, et al., “The brain tumor segmentation-metastases (brats-mets) challenge 2023: Brain metastasis segmentation on pre-treatment mri,” ArXiv, pp. arXiv–2306, 2024

  12. [12]

    nnu-net: a self-configuring method for deep learning-based biomedical image segmentation,

    Fabian Isensee, Paul F Jaeger, Simon AA Kohl, Jens Petersen, and Klaus H Maier-Hein, “nnu-net: a self-configuring method for deep learning-based biomedical image segmentation,” Nature methods, vol. 18, no. 2, pp. 203–211, 2021

  13. [13]

    Blob loss: Instance imbalance aware loss functions for semantic segmentation,

    Florian Kofler, Suprosanna Shit, Ivan Ezhov, Lucas Fidon, Izabela Horvath, Rami Al-Maskari, Hongwei Bran Li, Harsharan Bhatia, Timo Loehr, Marie Piraud, et al., “Blob loss: Instance imbalance aware loss functions for semantic segmentation,” in International Conference on Information Processing in Medical Imaging. Springer, 2023, pp. 755–767

  14. [14]

    Every component counts: Rethinking the measure of success for medical semantic segmentation in multi-instance segmentation tasks,

    Alexander Jaus, Constantin Marc Seibold, Simon Reiß, Zdravko Marinov, Keyi Li, Zeling Ye, Stefan Krieg, Jens Kleesiek, and Rainer Stiefelhagen, “Every component counts: Rethinking the measure of success for medical semantic segmentation in multi-instance segmentation tasks,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2025, vol. 39,...

  15. [15]

    A new family of instance-level loss functions for improving instance-level segmentation and detection of white matter hyperintensities in routine clinical brain mri,

    Muhammad Febrian Rachmadi, Michal Byra, and Henrik Skibbe, “A new family of instance-level loss functions for improving instance-level segmentation and detection of white matter hyperintensities in routine clinical brain mri,” Computers in Biology and Medicine, vol. 174, pp. 108414, 2024

  16. [16]

    Deep learning enables automatic detection and segmentation of brain metastases on multisequence mri,

    Endre Grøvik, Darvin Yi, Michael Iv, Elizabeth Tong, Daniel Rubin, and Greg Zaharchuk, “Deep learning enables automatic detection and segmentation of brain metastases on multisequence mri,” Journal of Magnetic Resonance Imaging, vol. 51, no. 1, pp. 175–182, 2020

  17. [17]

    Improving segmentation of objects with varying sizes in biomedical images using instance-wise and center-of-instance segmentation loss function,

    Febrian Rachmadi, Charissa Poon, and Henrik Skibbe, “Improving segmentation of objects with varying sizes in biomedical images using instance-wise and center-of-instance segmentation loss function,” in Medical Imaging with Deep Learning. PMLR, 2024, pp. 286–300

  18. [18]

    The liver tumor segmentation benchmark (lits),

    Patrick Bilic, Patrick Christ, Hongwei Bran Li, Eugene Vorontsov, Avi Ben-Cohen, Georgios Kaissis, Adi Szeskin, Colin Jacobs, Gabriel Efrain Humpire Mamani, Gabriel Chartrand, et al., “The liver tumor segmentation benchmark (lits),” Medical image analysis, vol. 84, pp. 102680, 2023

  19. [19]

    nnu-net revisited: A call for rigorous validation in 3d medical image segmentation,

    Fabian Isensee, Tassilo Wald, Constantin Ulrich, Michael Baumgartner, Saikat Roy, Klaus Maier-Hein, and Paul F Jaeger, “nnu-net revisited: A call for rigorous validation in 3d medical image segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2024, pp. 488–498

  20. [20]

    Incident cerebral lacunes: a review,

    Yifeng Ling and Hugues Chabriat, “Incident cerebral lacunes: a review,” Journal of Cerebral Blood Flow & Metabolism, vol. 40, no. 5, pp. 909–921, 2020

  21. [21]

    Where is valdo? vascular lesions detection and segmentation challenge at miccai 2021,

    Carole H Sudre, Kimberlin Van Wijnen, Florian Dubost, Hieab Adams, David Atkinson, Frederik Barkhof, Mahlet A Birhanu, Esther E Bron, Robin Camarasa, Nish Chaturvedi, et al., “Where is valdo? vascular lesions detection and segmentation challenge at miccai 2021,” Medical Image Analysis, vol. 91, pp. 103029, 2024

  22. [22]

    Standardized assessment of automatic segmentation of white matter hyperintensities and results of the wmh segmentation challenge,

    Hugo J Kuijf, J Matthijs Biesbroek, Jeroen De Bresser, Rutger Heinen, Simon Andermatt, Mariana Bento, Matt Berseth, Mikhail Belyaev, M Jorge Cardoso, Adria Casamitjana, et al., “Standardized assessment of automatic segmentation of white matter hyperintensities and results of the wmh segmentation challenge,” IEEE transactions on medical imaging, vol. 38, ...

  23. [23]

    The multimodal brain tumor image segmentation benchmark (brats),

    Bjoern H Menze, Andras Jakab, Stefan Bauer, Jayashree Kalpathy-Cramer, Keyvan Farahani, Justin Kirby, Yuliya Burren, Nicole Porz, Johannes Slotboom, Roland Wiest, et al., “The multimodal brain tumor image segmentation benchmark (brats),” IEEE transactions on medical imaging, vol. 34, no. 10, pp. 1993–2024, 2014

  24. [24]

    Advancing the cancer genome atlas glioma mri collections with expert segmentation labels and radiomic features,

    Spyridon Bakas, Hamed Akbari, Aristeidis Sotiras, Michel Bilello, Martin Rozycki, Justin S Kirby, John B Freymann, Keyvan Farahani, and Christos Davatzikos, “Advancing the cancer genome atlas glioma mri collections with expert segmentation labels and radiomic features,” Scientific data, vol. 4, no. 1, pp. 1–13, 2017

  25. [25]

    Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge

    Spyridon Bakas, Mauricio Reyes, Andras Jakab, Stefan Bauer, Markus Rempfler, Alessandro Crimi, Russell Takeshi Shinohara, Christoph Berger, Sung Min Ha, Martin Rozycki, et al., “Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the brats challenge,” arXiv preprint...

  26. [26]

    Segmentation labels for the pre-operative scans of the tcga-lgg collection,

    Spyridon Bakas, Hamed Akbari, Aristeidis Sotiras, Michel Bilello, Martin Rozycki, Justin Kirby, John Freymann, Keyvan Farahani, and Christos Davatzikos, “Segmentation labels for the pre-operative scans of the tcga-lgg collection,” The cancer imaging archive, 2017