AMD Severity Prediction And Explainability Using Image Registration And Deep Embedded Clustering
Pith reviewed 2026-05-25 01:44 UTC · model grok-4.3
The pith
Deep learning registration and clustering predict AMD severity from OCT images without a standard clinical scale.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By combining deep-learning-based image registration with deep embedded clustering, the method identifies diseased cases and predicts their severity levels in AMD OCT images. It is claimed to match state-of-the-art classification accuracy and to offer better explainability through registration outputs than class activation maps provide.
What carries the argument
Deep embedded clustering applied to registered OCT images to group cases by inferred severity levels.
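As a rough illustration of what deep embedded clustering does with an embedding, the sketch below computes the Student's t soft assignments and the sharpened target distribution used in the standard formulation. It assumes the paper follows that formulation, which the abstract does not confirm. The 2-D embeddings, the centroids, and the function names `soft_assign` and `target_distribution` are hypothetical.

```python
# Sketch of the soft-assignment step in the standard deep embedded clustering
# (DEC) formulation. Assumption: toy 2-D embeddings and centroids, not values
# from the paper.

def soft_assign(embeddings, centroids, alpha=1.0):
    """Student's t similarity q_ij between embedding i and centroid j."""
    q = []
    for point in embeddings:
        row = []
        for c in centroids:
            dist2 = sum((a - b) ** 2 for a, b in zip(point, c))
            row.append((1.0 + dist2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])  # each row sums to 1
    return q

def target_distribution(q):
    """Sharpened targets: p_ij proportional to q_ij^2 / f_j, with f_j = sum_i q_ij."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p

# Hypothetical embeddings of four registered images and two cluster centroids.
embeddings = [(0.0, 0.1), (0.2, 0.0), (2.9, 3.1), (3.0, 3.0)]
centroids = [(0.0, 0.0), (3.0, 3.0)]

q = soft_assign(embeddings, centroids)
p = target_distribution(q)
# Hard cluster label per image: index of the largest soft assignment.
labels = [max(range(len(row)), key=row.__getitem__) for row in q]
print(labels)
```

In the standard formulation, training minimizes the KL divergence between `p` and `q`. How cluster indices are then ordered into severity levels is not stated in the abstract.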
If this is right
- Classification performance matches state-of-the-art methods.
- Predicted severity performs well on previously unseen data.
- Registration output provides better explainability than class activation maps for label and severity decisions.
Where Pith is reading between the lines
- Such methods could enable consistent severity tracking across different clinics despite varying clinical practices.
- Future work might test whether these clusters align with actual patient outcomes such as vision loss rates.
- Integration with longitudinal data could allow prediction of disease progression rates.
Load-bearing premise
That the clusters identified by deep embedded clustering correspond to meaningful differences in AMD severity even without a standard clinical scale.
What would settle it
A comparison showing that the predicted severity levels do not align with expert clinical assessments or patient outcomes on an independent test set.
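One way such a comparison could be run is a rank correlation between cluster-derived severity and expert grades on held-out cases. The sketch below implements Spearman's rho with tie handling in plain Python. The two grade lists are hypothetical placeholders, not data from the paper, and the choice of Spearman's rho is an assumption about what a reasonable agreement measure might be.

```python
# Sketch: rank agreement between predicted severity and expert grades.
# Assumption: both lists are hypothetical, one entry per held-out OCT case.

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5  # undefined if either list is constant

predicted_severity = [0, 0, 1, 1, 2, 2, 3, 3]  # hypothetical cluster-derived levels
expert_grades = [0, 1, 1, 1, 2, 3, 3, 3]       # hypothetical clinician grades

rho = spearman(predicted_severity, expert_grades)
print(f"Spearman rho = {rho:.3f}")
```

A rho near zero or negative on an independent test set would be the kind of evidence that settles the question against the method.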
Original abstract
We propose a method to predict severity of age related macular degeneration (AMD) from input optical coherence tomography (OCT) images. Although there is no standard clinical severity scale for AMD, we leverage deep learning (DL) based image registration and clustering methods to identify diseased cases and predict their severity. Experiments demonstrate our approach's disease classification performance matches state of the art methods. The predicted disease severity performs well on previously unseen data. Registration output provides better explainability than class activation maps regarding label and severity decisions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes using deep learning-based image registration combined with deep embedded clustering on OCT images to identify AMD cases and predict their severity levels, in the absence of a standard clinical scale. It claims that the disease classification performance matches state-of-the-art methods, that the severity predictions generalize well to unseen data, and that registration outputs provide superior explainability over class activation maps for both labels and severity decisions.
Significance. If the clusters derived from deep embedded clustering can be shown to align with clinically meaningful AMD severity distinctions and the registration-based explanations hold under external validation, the approach could enable quantitative severity assessment and improved interpretability in a domain lacking standardized scales. The reported matching of SOTA classification performance and generalization would be a useful contribution if substantiated with full experimental details.
major comments (2)
- [Abstract] Abstract: the central claim that 'the predicted disease severity performs well on previously unseen data' and matches SOTA classification performance is unsupported because the abstract (and available description) supplies no dataset details, metrics, baselines, validation procedures, or cross-validation scheme, preventing any assessment of whether the data actually supports the claims.
- [Abstract] Abstract/Methods (clustering pipeline): the assertion that clusters meaningfully encode AMD severity levels lacks an external clinical validation anchor (e.g., correlation to expert grades, visual acuity, or lesion counts). Since the paper notes there is no standard clinical severity scale, deriving severity labels directly from the fitted clusters creates a circularity risk that undermines the generalization and explainability claims.
minor comments (1)
- [Abstract] The abstract refers to 'registration output' for explainability but does not specify the registration algorithm, loss terms, or how alignment outputs are mapped to severity decisions versus CAMs.
Simulated Author's Rebuttal
Thank you for reviewing our manuscript. We address each of the major comments below.
Point-by-point responses
- Referee: [Abstract] Abstract: the central claim that 'the predicted disease severity performs well on previously unseen data' and matches SOTA classification performance is unsupported because the abstract (and available description) supplies no dataset details, metrics, baselines, validation procedures, or cross-validation scheme, preventing any assessment of whether the data actually supports the claims.
  Authors: The abstract provides a high-level overview and is constrained by word limits. The full manuscript details the OCT dataset used, the performance metrics (such as accuracy and AUC), the state-of-the-art baselines compared against, and the validation procedures including the cross-validation scheme in the Experiments section. These support the claims made. We will revise the abstract to include brief mention of key results and dataset size to address this concern.
  Revision: partial
- Referee: [Abstract] Abstract/Methods (clustering pipeline): the assertion that clusters meaningfully encode AMD severity levels lacks an external clinical validation anchor (e.g., correlation to expert grades, visual acuity, or lesion counts). Since the paper notes there is no standard clinical severity scale, deriving severity labels directly from the fitted clusters creates a circularity risk that undermines the generalization and explainability claims.
  Authors: We note that the lack of a standard clinical severity scale for AMD is stated in the paper and is the reason for adopting an unsupervised clustering approach to derive severity levels from the data. The clusters are obtained via deep embedded clustering on registered images, and we show that the severity predictions generalize to unseen data while the registration provides visual explanations. This avoids circularity because no pre-existing severity labels are used; the method discovers structure in the data. External validation with clinical anchors is a valuable direction for future work but is outside the scope of the current technical contribution focused on the registration and clustering pipeline.
  Revision: no
Circularity Check
Severity labels and predictions both derive from the same deep embedded clustering step
specific steps
- self-definitional [Abstract]: "Although there is no standard clinical severity scale for AMD, we leverage deep learning (DL) based image registration and clustering methods to identify diseased cases and predict their severity. ... The predicted disease severity performs well on previously unseen data."
  Severity is introduced solely via the clustering output (no external scale exists), yet the same clustering procedure is then presented as producing independent 'predictions' whose quality is evaluated on held-out data. The performance metric therefore measures how well the clustering reproduces its own assignments rather than matching any external clinical quantity.
full rationale
The paper explicitly notes the absence of any external clinical severity scale and instead uses deep embedded clustering both to define severity categories and to generate the 'predictions' on unseen data. This makes the reported performance an internal consistency check on the clustering itself rather than an independent validation against clinical ground truth. No equations or self-citations are needed to see the reduction; the abstract states the construction directly.
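The concern can be made concrete with a toy example. The sketch below is not the paper's evaluation protocol, which is not available here. It only illustrates the argument as stated: if held-out "reference" severity comes from the same clustering that produces the predictions, agreement is perfect by construction, whatever the agreement with an independent grade. The centroids, embeddings, and expert grades are all hypothetical.

```python
# Toy illustration of the circularity concern; every number is hypothetical.

def nearest_centroid(x, centroids):
    """Return the index of the centroid closest to the scalar feature x."""
    return min(range(len(centroids)), key=lambda k: abs(x - centroids[k]))

def agreement(a, b):
    """Fraction of positions where two label lists match."""
    return sum(1 for u, v in zip(a, b) if u == v) / len(a)

# Centroids fitted on training data, ordered as severity levels 0..2.
centroids = [0.2, 0.5, 0.8]

# Held-out cases, each reduced to one scalar embedding for simplicity.
held_out = [0.15, 0.30, 0.45, 0.55, 0.70, 0.90]

# "Reference" severity defined by the clustering itself.
reference_from_clusters = [nearest_centroid(x, centroids) for x in held_out]
# "Predicted" severity produced by the same clustering.
predicted = [nearest_centroid(x, centroids) for x in held_out]

# Independent expert grades that the clustering never saw.
expert_grades = [0, 1, 1, 2, 1, 2]

self_agreement = agreement(predicted, reference_from_clusters)
external_agreement = agreement(predicted, expert_grades)
print(self_agreement, external_agreement)
```

Only `external_agreement` says anything about clinical validity, which is why the report asks for an external anchor.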