Advanced Tumor Segmentation in PET/CT Imaging: A Training Strategy Study with nnU-Net for AutoPET III

Hussain Alasmawi

arxiv: 2605.08161 · v1 · submitted 2026-05-04 · 💻 cs.CV

Pith reviewed 2026-05-12 01:13 UTC · model grok-4.3

classification 💻 cs.CV

keywords tumor segmentation · PET/CT imaging · nnU-Net · AutoPET challenge · data augmentation · intensity normalization · deep learning · medical imaging

The pith

Training strategies in nnU-Net bring the PET/CT tumor segmentation Dice score up to 0.80 and earn third place in AutoPET III.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests how intensity normalization, batch Dice optimization, and CraveMix augmentation affect nnU-Net performance on whole-body tumor segmentation in PET/CT scans. These choices are evaluated on the AutoPET III challenge data to see if they cut false positives and handle differences in lesion size, contrast, and location. The best combination reaches a Dice score of 0.80 on the preliminary test set. If the strategies hold up, automated segmentation becomes more dependable for disease assessment and treatment planning across varied scanners and tracers. The work also reports a third-place ranking in the challenge.
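For readers less familiar with the metric, the Dice score quoted above measures the overlap between a predicted mask P and a reference mask G as 2|P ∩ G| / (|P| + |G|). The sketch below is a minimal NumPy version. The function name and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details taken from the paper or the challenge's evaluation code.

```python
import numpy as np


def dice_score(pred, ref):
    """Dice overlap between two binary masks of identical shape."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return float(2.0 * np.logical_and(pred, ref).sum() / denom)
```

With a prediction of two voxels of which one overlaps a one-voxel reference, the score is 2·1 / (2 + 1) = 2/3.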

Core claim

Using the nnU-Net framework with a ResNet-based encoder as baseline, the authors systematically vary intensity normalization, batch Dice optimization, and CraveMix data augmentation. These adjustments reduce false positives and increase robustness to lesion variability in multi-center, multi-tracer settings. The strongest configuration reaches a Dice score of 0.80 on the preliminary test phase and places third in the AutoPET III challenge.

What carries the argument

nnU-Net with ResNet encoder plus intensity normalization, batch Dice loss optimization, and CraveMix augmentation that together steer training toward lower false positives and greater robustness across lesion types.
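"Batch Dice" is usually understood, in the nnU-Net setting, to mean that the soft Dice terms are pooled over every sample in the minibatch before the ratio is formed, instead of forming one ratio per sample and averaging. The sketch below illustrates that difference with NumPy for a single foreground channel. It is an illustration under that reading, not the author's training code; the function name and the smoothing constant `eps` are assumptions.

```python
import numpy as np


def soft_dice_loss(probs, target, batch_dice, eps=1e-5):
    """Soft Dice loss for foreground probabilities.

    probs, target: arrays of shape (batch, ...) with values in [0, 1].
    batch_dice=True pools the sums over the whole batch before taking
    the ratio; batch_dice=False takes one ratio per sample.
    """
    probs = np.asarray(probs, dtype=float)
    target = np.asarray(target, dtype=float)
    if batch_dice:
        axes = tuple(range(probs.ndim))       # reduce over batch and space
    else:
        axes = tuple(range(1, probs.ndim))    # reduce over space only
    intersection = (probs * target).sum(axis=axes)
    total = probs.sum(axis=axes) + target.sum(axis=axes)
    dice = (2.0 * intersection + eps) / (total + eps)
    return float(1.0 - np.mean(dice))
```

In a batch where one patch contains a perfectly segmented lesion and another is empty apart from a faint false positive, the per-sample variant scores the second patch near zero and the loss is about 0.5, while the pooled variant gives about 0.11. Whether this smoothing effect explains the behaviour reported in the paper is not something the page can confirm.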

If this is right

Lower false positive rate in the output segmentations
Greater stability when lesions vary in size, contrast, and body location
Improved generalization across PET tracers and imaging centers
More consistent automated support for disease evaluation and treatment planning
Third-place ranking among AutoPET III submissions
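The first consequence in the list, a lower false-positive rate, can be checked with voxel-level counts and precision, which are also the quantities the referee report asks for. The sketch below is one simple way to compute them. The function name, the `voxel_volume_ml` parameter, and the handling of empty predictions are assumptions; the challenge's own false-positive metric may be defined differently and is not reproduced here.

```python
import numpy as np


def false_positive_stats(pred, ref, voxel_volume_ml=1.0):
    """Voxel-level false-positive count, volume, and precision."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    tp = int(np.logical_and(pred, ref).sum())
    fp = int(np.logical_and(pred, ~ref).sum())
    # Assumption: an empty prediction has no false positives, so precision is 1.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
    return {
        "fp_voxels": fp,
        "fp_volume_ml": fp * voxel_volume_ml,
        "precision": precision,
    }
```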

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same strategy combinations could be tested on other whole-body imaging tasks such as lymphoma or metastasis detection
Clinical deployment would require prospective validation on streaming hospital data to confirm time savings over manual contours
The reported ranking invites direct comparison with the top two entries to isolate which training choices drove the gap
If the augmentation and normalization steps prove stable, they could serve as a reusable recipe for future nnU-Net challenges in hybrid imaging

Load-bearing premise

The chosen training strategies will generalize to unseen multi-center data and different PET tracers without substantial performance drop or overfitting to the training distribution.

What would settle it

A clear drop in Dice score below 0.70 when the same model is evaluated on a new external dataset collected at different centers with a different PET tracer.

Original abstract

Tumor segmentation in whole-body PET/CT imaging is crucial for precise disease evaluation and treatment planning. However, it remains challenging due to variability in lesion size, contrast, and anatomical distribution. Relying on manual segmentation makes the process time-consuming and prone to intra- and inter-observer variability. This work presents a whole-body tumor segmentation method developed for the AutoPET III challenge, where the goal is to build models that generalize across tracers and multi-center data. We employ the nnU-Net framework with a ResNet-based encoder as our baseline and systematically investigate the impact of training strategies, including intensity normalization, batch dice optimization, and data augmentation using CraveMix. Our experiments show that these strategies significantly influence model performance, particularly in reducing false positives and improving robustness to lesion variability. The best-performing configuration achieves a Dice score of up to 0.80 on the preliminary test phase, and our method ranked third in the AutoPET III challenge. The code is publicly available here.
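The abstract does not say which intensity normalization schemes were compared. As an illustration of the kind of choice involved, the sketch below contrasts two common options: clipping to fixed bounds and standardizing with dataset-level statistics, and standardizing each image with its own mean and standard deviation. The function names and every numeric value in the usage are assumptions chosen for the example, not settings from the paper.

```python
import numpy as np


def clip_and_zscore(image, lower, upper, mean, std):
    """Clip to [lower, upper], then standardize with fixed dataset-level stats."""
    image = np.clip(np.asarray(image, dtype=np.float32), lower, upper)
    return (image - mean) / max(std, 1e-8)


def per_image_zscore(image):
    """Standardize one image with its own mean and standard deviation."""
    image = np.asarray(image, dtype=np.float32)
    return (image - image.mean()) / max(float(image.std()), 1e-8)


# Hypothetical values: clip to [-1000, 1000], dataset mean 0, dataset std 500.
ct_like = clip_and_zscore([-2000.0, 0.0, 3000.0], -1000.0, 1000.0, 0.0, 500.0)
pet_like = per_image_zscore([1.0, 2.0, 3.0])
```

The first scheme keeps absolute intensities comparable across scans, which suits calibrated modalities; the second removes per-scan offsets, which may help or hurt when uptake values differ across tracers.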

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Third-place AutoPET III result with nnU-Net tweaks, but no ablations to show what the training changes actually added.

This paper delivers a third-place finish in the AutoPET III challenge by running nnU-Net on whole-body PET/CT tumor segmentation. The core contribution is a working pipeline that hits 0.80 Dice on the preliminary test set, with public code that anyone can download and try. It focuses on generalization across tracers and centers, which matches the challenge rules exactly. That ranking and the released implementation are the concrete outputs worth noting.

The authors start from the standard nnU-Net ResNet encoder and add intensity normalization, batch Dice loss, and CraveMix augmentation. They report that these choices reduce false positives and improve robustness to lesion variability. The practical angle is clear: they tuned for the exact dataset and evaluation protocol used in the challenge.

What the paper does well is keep the method straightforward and reproducible. Releasing code lowers the barrier for others who want to benchmark against this entry or adapt the config. The claim that the strategies matter is plausible given the final ranking, but the evidence stays at the level of the best configuration only.

The stress-test point holds: there are no ablation tables comparing the unmodified nnU-Net baseline against each added component on the same splits. Without those deltas, plus any error bars or statistical checks, it is hard to attribute the ranking specifically to normalization, the loss, or CraveMix rather than the base framework or data properties. Hyperparameter details also stay high-level.

This work is for people already running segmentation challenges or tuning nnU-Net on PET/CT data. A reader who needs a quick starting point and public code will get some value. It does not introduce new architecture or theory, so it will not interest readers looking for first-principles advances.

I would send it to peer review. The challenge result is verifiable and the code is available, so referees can check the claims directly. Revisions should add the missing ablations and basic variability stats to make the attribution clearer.

Referee Report

2 major / 2 minor

Summary. The paper presents a whole-body tumor segmentation method for the AutoPET III challenge based on the nnU-Net framework with a ResNet encoder. It systematically examines the effects of three training strategies—intensity normalization, batch Dice optimization, and CraveMix augmentation—claiming these choices significantly improve performance by reducing false positives and increasing robustness to lesion variability. The best configuration reaches a Dice score of 0.80 on the preliminary test phase and places third in the challenge; code is released publicly.

Significance. If the performance gains can be rigorously attributed to the listed strategies through controlled experiments, the work would supply useful empirical guidance for nnU-Net users facing multi-tracer, multi-center PET/CT data. The public code release is a clear positive that aids reproducibility.

major comments (2)

Abstract: the central claim that intensity normalization, batch Dice optimization, and CraveMix 'significantly influence model performance, particularly in reducing false positives' is unsupported by any ablation results. The manuscript reports only the final Dice score of 0.80 and the challenge ranking; it contains no tables or figures that compare baseline nnU-Net against each strategy (or their combinations) on the same validation split, nor any precision, false-positive counts, or per-component deltas.
The manuscript provides no quantitative details on hyperparameter values, data-split statistics, statistical significance tests, or error bars, preventing verification that the reported ranking is attributable to the proposed strategies rather than the base nnU-Net pipeline or dataset characteristics.
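The ablation requested in the first major comment amounts to training every on/off combination of the three strategies on the same split. The sketch below enumerates that grid with the standard library. The strategy keys are hypothetical labels chosen here (with the augmentation spelled CarveMix, as in the reference graph), not configuration names from the released code.

```python
from itertools import product

# Hypothetical labels for the three strategies named in the abstract.
STRATEGIES = ("intensity_normalization", "batch_dice", "carvemix")


def ablation_configs():
    """Return every on/off combination; the all-off entry is the baseline."""
    return [
        dict(zip(STRATEGIES, flags))
        for flags in product((False, True), repeat=len(STRATEGIES))
    ]


configs = ablation_configs()
```

Three binary choices give eight runs: the baseline, three single-strategy runs, three pairs, and the full combination. Comparing each single-strategy run against the baseline yields the per-component deltas the comment asks for.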

minor comments (2)

The CraveMix augmentation is mentioned without a reference or implementation details, which reduces clarity for readers unfamiliar with the method.
Exact training hyperparameters (learning rate, batch size, normalization parameters, etc.) and the precise composition of training/validation/test splits are not stated.
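On the first minor comment: the reference graph on this page lists "CarveMix: a simple data augmentation method for brain lesion segmentation", which appears to be the method the abstract spells "CraveMix". The general idea is to carve a lesion-centred region out of one annotated scan and paste it, with its labels, into another. The sketch below is a simplified NumPy illustration of that idea only: it grows the lesion mask by a random margin, whereas the published method derives the region from a distance to the lesion boundary. Function names, `max_margin`, and the dilation scheme are assumptions, and nothing here is taken from the author's implementation.

```python
import numpy as np


def dilate(mask, iterations):
    """Binary dilation with a face-connected neighbourhood, NumPy only."""
    mask = np.asarray(mask, dtype=bool)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    for _ in range(iterations):
        # Pad with background so shifts never wrap lesion voxels around edges.
        padded = np.pad(mask, 1, mode="constant", constant_values=False)
        grown = mask.copy()
        for axis in range(mask.ndim):
            for shift in (-1, 1):
                grown |= np.roll(padded, shift, axis=axis)[core]
        mask = grown
    return mask


def carvemix_like(img_a, lab_a, img_b, lab_b, rng, max_margin=3):
    """Paste a lesion-centred region of scan A (image and labels) into scan B."""
    margin = int(rng.integers(0, max_margin + 1))  # random region size
    roi = dilate(np.asarray(lab_a) > 0, margin)
    mixed_img = np.where(roi, img_a, img_b)
    mixed_lab = np.where(roi, lab_a, lab_b)
    return mixed_img, mixed_lab
```

Both scans must share a shape for the paste to work, so in practice this would run on patches or resampled volumes.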

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and the recommendation for major revision. We agree that the current manuscript lacks the necessary ablation studies and quantitative details to fully support the claims regarding the impact of the training strategies. We will revise the manuscript to include these elements for improved rigor and reproducibility.

Referee: Abstract: the central claim that intensity normalization, batch Dice optimization, and CraveMix 'significantly influence model performance, particularly in reducing false positives' is unsupported by any ablation results. The manuscript reports only the final Dice score of 0.80 and the challenge ranking; it contains no tables or figures that compare baseline nnU-Net against each strategy (or their combinations) on the same validation split, nor any precision, false-positive counts, or per-component deltas.

Authors: We acknowledge that the abstract's claim is not supported by explicit ablation results in the current manuscript, which reports only the final performance and ranking. In the revised version, we will add a dedicated ablation study section with tables and figures comparing the baseline nnU-Net (with ResNet encoder) against configurations using intensity normalization, batch Dice loss, and CraveMix augmentation individually and in combination. These will be evaluated on the same validation split and will include additional metrics such as precision, false-positive counts, and qualitative examples demonstrating reduced false positives.
Revision: yes

Referee: The manuscript provides no quantitative details on hyperparameter values, data-split statistics, statistical significance tests, or error bars, preventing verification that the reported ranking is attributable to the proposed strategies rather than the base nnU-Net pipeline or dataset characteristics.

Authors: We agree that the absence of these details limits the ability to verify the contributions of the proposed strategies. In the revision, we will expand the methods and experimental sections to include: specific hyperparameter values for the nnU-Net training (e.g., learning rate, batch size, patch size), data-split statistics (number of cases per split, tracer and center distributions), any statistical significance tests performed, and error bars or standard deviations from repeated experiments or cross-validation where available.
Revision: yes
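One way the promised error bars could be produced is a paired bootstrap over per-case Dice scores from two configurations evaluated on the same cases. The sketch below returns the mean difference and a percentile confidence interval. It is an illustration of a standard technique, not a procedure the paper or the simulated rebuttal commits to; the function name, `n_boot`, `alpha`, and `seed` are assumptions.

```python
import numpy as np


def paired_bootstrap_ci(dice_a, dice_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean per-case Dice difference (a - b)."""
    diff = np.asarray(dice_a, dtype=float) - np.asarray(dice_b, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample cases with replacement; each row is one bootstrap replicate.
    idx = rng.integers(0, diff.size, size=(n_boot, diff.size))
    boot_means = diff[idx].mean(axis=1)
    lower, upper = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return float(diff.mean()), float(lower), float(upper)
```

If the interval for "strategy on" minus "baseline" excludes zero, the improvement is unlikely to be a resampling artefact of the validation cases, though it says nothing about generalization to other centers or tracers.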

Circularity Check

0 steps flagged

No significant circularity

The manuscript is an empirical study applying the nnU-Net framework to PET/CT tumor segmentation for the AutoPET III challenge. It reports experimental outcomes (Dice scores up to 0.80 and third-place ranking) obtained by training on provided data and evaluating on an external preliminary test phase. No equations, derivations, fitted parameters renamed as predictions, self-definitional loops, or load-bearing self-citations appear in the text. Claims about the influence of intensity normalization, batch Dice loss, and CraveMix augmentation rest on reported configurations rather than any closed-form reduction to the same inputs by construction. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The work rests on the standard assumptions of the nnU-Net framework and supervised deep learning for segmentation; no new free parameters, axioms, or invented entities are introduced beyond routine training choices.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Passage: We employ the nnU-Net framework with a ResNet-based encoder as our baseline and systematically investigate the impact of training strategies, including intensity normalization, batch dice optimization, and data augmentation using CraveMix.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Passage: The best-performing configuration achieves a Dice score of up to 0.80 on the preliminary test phase

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

11 extracted references · 11 canonical work pages

[1] nnU-Net for Automated Lesion Segmentation in Whole-body FDG-PET/CT. 2022.
[2] Autopet challenge: Combining nn-unet with swin unetr augmented by maximum intensity projection classifier. arXiv preprint arXiv:2209.01112.
[3] Exploring vanilla u-net for lesion segmentation from whole-body fdg-pet/ct scans. arXiv preprint arXiv:2210.07490.
[4] nnU-Net for brain tumor segmentation. Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries: 6th International Workshop, BrainLes 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4, 2020, Revised Selected Papers, Part II. 2021.
[5] CarveMix: a simple data augmentation method for brain lesion segmentation. NeuroImage. 2023.
[6] Look Ma, no code: fine tuning nnU-Net for the AutoPET II challenge by only adjusting its JSON plans. arXiv preprint arXiv:2309.13747.
[7] nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nature Methods. 2021.
[8] U-net: Convolutional networks for biomedical image segmentation. Medical Image Computing and Computer-Assisted Intervention (MICCAI 2015): 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III. 2015.
[9] nnu-net revisited: A call for rigorous validation in 3d medical image segmentation. arXiv preprint arXiv:2404.09556.
[10] FDG-PET-CT-Lesions. 2022.
[11] Results from the autoPET challenge on fully automated lesion segmentation in whole-body FDG-PET/CT. Nature Machine Intelligence. 2024.
