arxiv: 2604.09062 · v1 · submitted 2026-04-10 · 💻 cs.CV

Nested Radially Monotone Polar Occupancy Estimation: Clinically-Grounded Optic Disc and Cup Segmentation for Glaucoma Screening

Rimsa Goperma, Rojan Basnet, Liang Zhao

Pith reviewed 2026-05-10 16:56 UTC · model grok-4.3

classification 💻 cs.CV

keywords optic disc segmentation · optic cup segmentation · glaucoma screening · polar occupancy · star-convexity · fundus images · deep learning

The pith

Representing optic disc and cup segmentation as nested polar occupancy guarantees clinically valid shapes for glaucoma screening.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes that optic disc and optic cup segmentation can be recast as nested radially monotone polar occupancy estimation to enforce star-convexity and nesting by construction. Standard deep learning methods frequently produce invalid shapes that corrupt clinical metrics such as vertical cup-to-disc ratio, especially when models are tested on new datasets. The new representation is shown to preserve or improve boundary accuracy while delivering perfect anatomical validity and substantial gains in Dice scores and error reduction on multiple fundus image collections.

Core claim

NPS-Net is the first framework that formulates the OD/OC segmentation as nested radially monotone polar occupancy estimation. This output representation can guarantee the clinical validness including star-convexity and nested structure of OD and OC and achieve high accuracy.

What carries the argument

Nested radially monotone polar occupancy estimation, a polar-coordinate representation that enforces star-convex boundaries and one region inside the other by design.
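
A minimal sketch of how such an output can be made valid by construction, under stated assumptions. The paper's exact parameterization is not visible in this review, so everything here is hypothetical: the function names, the cumulative-product construction, and the choice to obtain the cup by multiplying the disc occupancy by a second monotone factor (one plausible reading of the "factorized nesting" named in the Figure 2 caption). Each list holds values along a single ray from the centre outwards.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function with values in [0, 1]."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def monotone_occupancy(logits):
    """Map unconstrained per-bin logits along one ray to a non-increasing profile.

    Every factor lies in [0, 1], so the running product can only shrink
    as the radius grows (assumed construction, not taken from the paper).
    """
    occ, running = [], 1.0
    for z in logits:
        running *= sigmoid(z)
        occ.append(running)
    return occ

def nested_occupancy(disc_logits, cup_logits):
    """Return (disc, cup) profiles with cup <= disc at every radial bin."""
    disc = monotone_occupancy(disc_logits)
    ratio = monotone_occupancy(cup_logits)       # second monotone factor in [0, 1]
    cup = [d * q for d, q in zip(disc, ratio)]   # product stays monotone and below disc
    return disc, cup

def soft_radius(occ, dr=1.0):
    """Soft boundary radius: occupancy summed along the ray times the bin width."""
    return dr * sum(occ)

# Usage with arbitrary logits for a single ray of 32 radial bins.
disc_logits = [3.0 * math.sin(i) for i in range(32)]
cup_logits = [3.0 * math.cos(i) for i in range(32)]
disc, cup = nested_occupancy(disc_logits, cup_logits)
```

Whatever logits the network emits, the profiles produced this way cannot increase with radius and the cup cannot exceed the disc, which is the sense in which the constraints hold "by design" rather than through a loss term.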

If this is right

Maintains 100 percent anatomical validity on the RIM-ONE dataset
Raises Cup Dice on RIM-ONE by 12.8 percent absolute over the strongest baseline
Cuts vertical cup-to-disc ratio (vCDR) mean absolute error on RIM-ONE by more than 56 percent
Shows strong zero-shot generalization across seven public fundus datasets
Achieves Disc Dice of 0.9438 and a 95th-percentile Hausdorff distance (HD95) of 2.78 pixels on PAPILA, the latter an 83 percent reduction over the best competing method
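
To make the vCDR metric above concrete, here is a small sketch of how it can be computed from binary masks and how a single stray false-positive pixel corrupts it. The helper names and the toy square masks are illustrative assumptions, not the paper's evaluation code.

```python
def vertical_diameter(mask):
    """Vertical extent in pixels: span between first and last non-empty rows."""
    rows = [i for i, row in enumerate(mask) if any(row)]
    return rows[-1] - rows[0] + 1 if rows else 0

def vcdr(cup_mask, disc_mask):
    """Vertical cup-to-disc ratio from two binary masks (0.0 if the disc is empty)."""
    d = vertical_diameter(disc_mask)
    return vertical_diameter(cup_mask) / d if d else 0.0

def mean_absolute_error(pred, true):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(pred)

def box(size, lo, hi):
    """Toy square mask covering rows and columns lo..hi inclusive."""
    return [[lo <= y <= hi and lo <= x <= hi for x in range(size)]
            for y in range(size)]

disc = box(12, 2, 9)               # 8 rows tall
cup = box(12, 4, 7)                # 4 rows tall
clean = vcdr(cup, disc)            # 4 / 8 = 0.5

noisy_cup = [row[:] for row in cup]
noisy_cup[0][0] = True             # one disconnected false-positive pixel
corrupted = vcdr(noisy_cup, disc)  # cup now spans rows 0..7, so 8 / 8 = 1.0
```

In this toy case one invalid pixel doubles the measured ratio, which illustrates why shape validity matters for the diagnostic metric and not only for overlap scores.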

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same polar occupancy format could be adapted to other medical segmentation problems that require convexity or containment constraints
Clinical software pipelines might be simplified by removing post-processing steps that currently repair invalid shapes
The representation may allow direct incorporation of geometric priors into network training for shape-sensitive tasks beyond ophthalmology

Load-bearing premise

Converting the segmentation task into prediction of nested radially monotone polar occupancy will let a neural network recover accurate boundaries while automatically satisfying the required geometric constraints.

What would settle it

A counter-example in which the network outputs a shape that violates star-convexity or nesting despite using the polar occupancy representation, or where segmentation accuracy falls below an unconstrained baseline on a dataset containing many non-convex examples.
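
A search for such a counter-example could be automated with a validity check along the following lines. This is a rough sketch under assumptions: the centre used for the star-convexity test is supplied by the caller (the paper's choice of centre is not stated in this review), and the segment test is rasterized, so it can occasionally reject a valid shape near a curved boundary.

```python
def is_nested(cup_mask, disc_mask):
    """True if every cup pixel is also a disc pixel."""
    return all((not c) or d
               for cup_row, disc_row in zip(cup_mask, disc_mask)
               for c, d in zip(cup_row, disc_row))

def is_star_convex(mask, center, samples=64):
    """Approximate star-convexity test with respect to `center` (row, col).

    For every foreground pixel, points sampled on the segment back to the
    centre must round to foreground pixels as well.
    """
    cy, cx = center
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if not value:
                continue
            for k in range(samples + 1):
                t = k / samples
                py = round(cy + t * (y - cy))
                px = round(cx + t * (x - cx))
                if not mask[py][px]:
                    return False
    return True

def box(size, lo, hi):
    """Toy square mask covering rows and columns lo..hi inclusive."""
    return [[lo <= y <= hi and lo <= x <= hi for x in range(size)]
            for y in range(size)]

disc = box(12, 2, 9)
cup = box(12, 4, 7)
broken = [row[:] for row in disc]
broken[0][0] = True   # detached blob: the segment to the centre crosses background

valid_case = is_star_convex(disc, (5, 5)) and is_nested(cup, disc)
invalid_case = is_star_convex(broken, (5, 5))
```

Running a check like this over every prediction on a held-out dataset would directly count violations, and a single failure on outputs produced through the polar representation would contradict the by-construction guarantee.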

Figures

Figures reproduced from arXiv: 2604.09062 by Liang Zhao, Rimsa Goperma, Rojan Basnet.

**Figure 2.** Illustration of monotone occupancy and factorized nesting.

**Figure 3.** Qualitative comparison on RIM-ONE (zero-shot) including best cases (top), worst cases (middle), and extreme cases (bottom): Disc in green, cup in …

**Figure 6.** Angular rim profile rim(θ) = rd(θ) − rc(θ) for a representative RIM-ONE image. NPS-Net (dotted) tracks the ground truth (solid) with high fidelity, preserving the ISNT pattern. Direct rim supervision (16) ensures this is a first-class training objective. […] its specific contribution: the largest single-step Rim Corr improvement (0.43→0.52) occurs when the shape prior activates, indicating that angular fideli…

**Figure 5.** Predicted vs. ground-truth vCDR on RIM-ONE and PAPILA for …
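
The rim profile in the Figure 6 caption, rim(θ) = rd(θ) − rc(θ), is simple to compute once per-angle radii are available. The sketch below assumes that "Rim Corr" denotes a Pearson correlation between predicted and ground-truth rim profiles; that reading, the synthetic radii, and all function names are assumptions rather than details confirmed by the source.

```python
import math

def rim_profile(disc_radii, cup_radii):
    """Per-angle rim width: disc radius minus cup radius."""
    return [rd - rc for rd, rc in zip(disc_radii, cup_radii)]

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

# Synthetic, vertically elongated disc and cup sampled at 64 angles.
angles = [2 * math.pi * k / 64 for k in range(64)]
gt_disc = [10.0 + math.cos(2 * a) for a in angles]
gt_cup = [5.0 + 0.5 * math.cos(2 * a) for a in angles]
pred_disc = [r + 0.2 for r in gt_disc]   # constant over-estimate of the disc radius

gt_rim = rim_profile(gt_disc, gt_cup)
pred_rim = rim_profile(pred_disc, gt_cup)
rim_corr = pearson(gt_rim, pred_rim)
```

A uniform radial offset leaves the correlation unchanged, so a metric of this kind rewards getting the angular pattern of the rim right rather than its absolute width.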

Original abstract

Valid segmentation of the optic disc (OD) and optic cup (OC) from fundus photographs is essential for glaucoma screening. Unfortunately, existing deep learning methods do not guarantee clinical validness including star-convexity and nested structure of OD and OC, resulting corruption in diagnostic metric, especially under cross-dataset domain shift. To adress this issue, this paper proposed NPS-Net (Nested Polar Shape Network), the first framework that formulates the OD/OC segmentation as nested radially monotone polar occupancy estimation. This output representation can guarantee the aforementioned clinical validness and achieve high accuracy. Evaluated across seven public datasets, NPS-Net shows strong zero-shot generalization. On RIM-ONE, it maintains 100% anatomical validity and improves Cup Dice by 12.8% absolute over the best baseline, reducing vCDR MAE by over 56%. On PAPILA, it achieves Disc Dice of 0.9438 and Disc HD95 of 2.78 px, an 83% reduction over the best competing method.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper introduces a polar occupancy representation to enforce clinical constraints in OD/OC segmentation but lacks ablations to confirm its role in the reported gains.

The one thing to know is that this work replaces direct mask output with a nested radially monotone polar occupancy representation for segmenting the optic disc and cup. The goal is to enforce star-convexity and nesting by design so that the vertical cup-to-disc ratio stays clinically valid even when the model generalizes across datasets. The new element is this specific polar occupancy formulation, presented as the first for OD/OC segmentation. NPS-Net uses it to achieve 100% validity on RIM-ONE along with a 12.8% absolute gain in Cup Dice and a reduction of more than 56% in vCDR mean absolute error. Results on PAPILA show a competitive Disc Dice of 0.9438 and a low Hausdorff distance (HD95 of 2.78 px). Testing on seven datasets overall supports the generalization claim.

This approach has merit because existing methods often produce invalid shapes that corrupt the diagnostic metrics, especially under domain shift. Building the constraints into the output avoids post-hoc fixes and keeps the predictions anatomically plausible.

The soft spots center on the lack of targeted validation for the representation. No ablation compares the same network with a standard mask head to show that the polar format, rather than some other factor, drives the gains. The discretization into radial and angular bins might introduce bias or lose precision on irregular optic nerve head boundaries, and without an analysis of the mapping error from occupancy back to mask, that is hard to rule out. The stress-test concern about confirming that boundary accuracy is not degraded is fair and is not addressed in the visible details.

This is aimed at the medical imaging community working on glaucoma screening and shape-aware segmentation. Readers who want models that respect clinical priors without sacrificing too much accuracy will get something useful from the multi-dataset evaluation. It is worth a serious referee because the clinical motivation is clear and the reported improvements are substantial enough to examine in detail. I would recommend sending it for peer review with requests for the missing ablations and fidelity checks on the polar representation.

Referee Report

2 major / 2 minor

Summary. The paper proposes NPS-Net, the first framework to formulate optic disc (OD) and optic cup (OC) segmentation from fundus photographs as nested radially monotone polar occupancy estimation. This output representation is claimed to guarantee clinical validity properties including star-convexity and nested OD/OC structure by construction while delivering high segmentation accuracy. Experiments across seven public datasets report strong zero-shot generalization, 100% anatomical validity on RIM-ONE, a 12.8% absolute Cup Dice improvement, over 56% reduction in vCDR MAE, and large gains in Disc Dice/HD95 on PAPILA.

Significance. If the central claim holds, the work would be significant for medical image segmentation by embedding anatomical constraints directly into the output space rather than relying on post-processing or loss terms. The reported cross-dataset robustness and perfect validity rate address a practical failure mode in glaucoma screening metrics. Credit is due for the explicit focus on clinically grounded validity guarantees and the scale of the multi-dataset evaluation.

major comments (2)

[Abstract and §4] Abstract and §4 (Experiments): The large reported gains (12.8% Cup Dice, 56% vCDR MAE reduction) are presented as resulting from the nested radially monotone polar occupancy representation, yet no ablation is described that applies an identical backbone and training regime to a standard pixel-wise mask output head. Without this isolation, it remains unclear whether the improvements stem from the representation itself or from other architectural choices, directly bearing on the claim that the polar formulation simultaneously enforces validity and preserves (or improves) boundary accuracy.
[§3] §3 (Method): The forward mapping from binary mask to polar occupancy vector and the inverse reconstruction must be shown to introduce negligible discretization error on real OD/OC shapes for the accuracy claim to hold. The manuscript should quantify the boundary error (e.g., via Hausdorff distance or Dice) introduced by the chosen angular/radial discretization on the training distribution; otherwise the central assertion that the representation “achieve[s] high accuracy” while guaranteeing validity rests on an unverified assumption.

minor comments (2)

[Abstract] Abstract: Typo in “To adress this issue”.
[§2 and §3] §2 and §3: The precise definition of “radially monotone” and the nesting constraint should be stated with an equation or pseudocode early in the method section for immediate clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, providing clarifications and committing to revisions that strengthen the manuscript's claims regarding the contribution of the nested radially monotone polar occupancy representation.

Referee: [Abstract and §4] Abstract and §4 (Experiments): The large reported gains (12.8% Cup Dice, 56% vCDR MAE reduction) are presented as resulting from the nested radially monotone polar occupancy representation, yet no ablation is described that applies an identical backbone and training regime to a standard pixel-wise mask output head. Without this isolation, it remains unclear whether the improvements stem from the representation itself or from other architectural choices, directly bearing on the claim that the polar formulation simultaneously enforces validity and preserves (or improves) boundary accuracy.

Authors: We agree that the absence of an ablation isolating the output representation from other architectural and training choices leaves the source of the reported gains partially ambiguous. The current evaluation compares NPS-Net against published baselines but does not hold the backbone and optimization fixed while swapping only the head. In the revised manuscript we will add this controlled ablation, training an otherwise identical network with a conventional pixel-wise segmentation head under the same regime and reporting the resulting Dice, HD95, and validity metrics. This addition will directly test whether the nested radially monotone polar formulation is responsible for the observed improvements in accuracy and anatomical validity.

revision: yes

Referee: [§3] §3 (Method): The forward mapping from binary mask to polar occupancy vector and the inverse reconstruction must be shown to introduce negligible discretization error on real OD/OC shapes for the accuracy claim to hold. The manuscript should quantify the boundary error (e.g., via Hausdorff distance or Dice) introduced by the chosen angular/radial discretization on the training distribution; otherwise the central assertion that the representation “achieve[s] high accuracy” while guaranteeing validity rests on an unverified assumption.

Authors: We concur that explicit quantification of discretization error is required to substantiate the claim that the polar representation simultaneously guarantees validity and maintains high boundary accuracy. Although the strong empirical results across seven datasets provide indirect support, we will add a dedicated analysis in the revised §3. Specifically, we will reconstruct binary masks from the polar occupancy vectors using the chosen angular and radial discretization, then compute Dice and Hausdorff distance between the reconstructed and original ground-truth masks on the training distributions. These statistics will be reported to demonstrate that the introduced error is negligible for clinically observed OD/OC shapes.

revision: yes
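
The round-trip analysis promised in this response could look roughly like the sketch below: encode a mask as one radius per angular bin, decode it back to a mask, and score the overlap. The ray-marching encoder, the nearest-bin decoder, the bin counts, and the synthetic disk are all assumptions made for illustration; the paper's actual discretization is not given in this review, and a Hausdorff term would be added in the same way as Dice.

```python
import math

def mask_to_radii(mask, center, n_angles=64, dr=0.25):
    """Encode a mask as one boundary radius per angle by marching along each ray."""
    cy, cx = center
    h, w = len(mask), len(mask[0])
    radii = []
    for k in range(n_angles):
        theta = 2 * math.pi * k / n_angles
        r = 0.0
        while True:
            y = round(cy + (r + dr) * math.sin(theta))
            x = round(cx + (r + dr) * math.cos(theta))
            if not (0 <= y < h and 0 <= x < w) or not mask[y][x]:
                break   # next step would leave the image or the foreground
            r += dr
        radii.append(r)
    return radii

def radii_to_mask(radii, center, shape):
    """Decode radii to a mask: a pixel is inside if it lies within its bin's radius."""
    cy, cx = center
    n = len(radii)
    out = []
    for y in range(shape[0]):
        row = []
        for x in range(shape[1]):
            theta = math.atan2(y - cy, x - cx) % (2 * math.pi)
            k = round(theta / (2 * math.pi) * n) % n   # nearest angular bin
            row.append(math.hypot(y - cy, x - cx) <= radii[k])
        out.append(row)
    return out

def dice(a, b):
    """Dice overlap between two binary masks of equal shape."""
    inter = sum(1 for ra, rb in zip(a, b) for p, q in zip(ra, rb) if p and q)
    total = sum(map(sum, a)) + sum(map(sum, b))
    return 2 * inter / total if total else 1.0

# Synthetic disk of radius 10 px on a 41 x 41 grid.
center, size = (20, 20), 41
disk = [[math.hypot(y - 20, x - 20) <= 10 for x in range(size)] for y in range(size)]
radii = mask_to_radii(disk, center)
reconstructed = radii_to_mask(radii, center, (size, size))
round_trip_dice = dice(disk, reconstructed)
```

Reporting this score over real ground-truth masks, for several angular and radial resolutions, would give an upper bound on the accuracy any network could reach through the representation.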

Circularity Check

0 steps flagged

No circularity: representation chosen by design, not derived from fitted inputs

The paper introduces nested radially monotone polar occupancy estimation as an explicit modeling choice for the output representation. This choice is stated to enforce star-convexity and nested structure by construction of the representation itself, rather than being derived from or fitted to the input data or predictions in a way that reduces tautologically. No equations or steps in the provided abstract or description show a prediction that is statistically forced by a prior fit, nor any load-bearing self-citation chain. The accuracy claims are presented as empirical results on datasets, not as first-principles derivations that collapse to the inputs. The formulation is therefore self-contained as an independent architectural decision.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

Review performed on abstract only; full architecture, loss functions, and training details unavailable, so ledger is limited to claims extractable from the abstract.

axioms (2)

domain assumption: Fundus photographs contain sufficient signal to learn polar occupancy maps for OD and OC
Implicit assumption required for any deep learning approach on this data modality
standard math: Radially monotone occupancy in polar coordinates mathematically guarantees star-convexity and nesting
Stated property of the output representation used to enforce clinical validity
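
The "standard math" axiom can be spelled out in a few lines. The statement below is a sketch under assumptions: the symbols $o_d$, $o_c$, $q$, $\tau$ and the factorized form of the cup occupancy are introduced here for illustration and are not taken from the paper, whose exact formulation is not visible in this review.

```latex
% Setup (assumed notation): centre c, unit direction u(\theta),
% occupancy o(r,\theta) \in [0,1], threshold \tau \in (0,1].
\begin{align*}
&\text{Radial monotonicity:} && o(r_1,\theta) \ge o(r_2,\theta)
   \quad \text{for all } 0 \le r_1 \le r_2 . \\
&\text{Predicted region:} && S_\tau = \{\, c + r\,u(\theta) : o(r,\theta) \ge \tau \,\} . \\
&\text{Star-convexity:} && p = c + r\,u(\theta) \in S_\tau ,\; t \in [0,1]
   \;\Rightarrow\; o(tr,\theta) \ge o(r,\theta) \ge \tau
   \;\Rightarrow\; c + t r\,u(\theta) \in S_\tau . \\
&\text{Factorized cup:} && o_c(r,\theta) = o_d(r,\theta)\, q(r,\theta),
   \quad q \in [0,1] \text{ non-increasing in } r . \\
&\text{Nesting:} && o_c \le o_d
   \;\Rightarrow\; S^{c}_\tau \subseteq S^{d}_\tau .
\end{align*}
```

Because a product of non-negative non-increasing functions is itself non-increasing, the cup occupancy in this factorized form is also radially monotone, so the cup region is star-convex with respect to the same centre.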

invented entities (1)

Nested radially monotone polar occupancy (no independent evidence)
purpose: Output representation that enforces star-convexity and nesting for OD/OC
New modeling construct introduced to replace standard segmentation outputs

pith-pipeline@v0.9.0 · 5485 in / 1541 out tokens · 54521 ms · 2026-05-10T16:56:09.913614+00:00 · methodology
