Knowing When Not to Predict: Self Supervised Learning and Abstention for Safer DR Screening

Jan H. Terheyden; Lorenz Sparrenberg; Muskaan Chopra; Rafet Sifa

arxiv: 2605.19133 · v1 · pith:DUFL6ACMnew · submitted 2026-05-18 · 💻 cs.CV · cs.AI

Pith reviewed 2026-05-20 10:29 UTC · model grok-4.3

keywords: self-supervised learning · abstention · diabetic retinopathy · selective prediction · confidence calibration · medical image analysis

The pith

Self-supervised pretraining improves selective prediction in diabetic retinopathy screening, but longer pretraining does not consistently enhance reliability after accuracy saturates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper examines the impact of self-supervised learning pretraining duration on a model's ability to abstain from unreliable predictions in diabetic retinopathy grading. The authors evaluate multiple checkpoints from SSL pretraining using a fixed fine-tuning setup and measure performance on calibrated confidence and abstention metrics such as coverage and selective accuracy. They find that SSL pretraining leads to better selective prediction than training models from scratch across various datasets and data regimes. However, selective performance can still vary significantly across different checkpoints even when overall accuracy has stopped improving. This highlights the need to consider pretraining length as a factor in building reliable, safety-aware medical AI systems rather than focusing solely on accuracy.

Core claim

The paper claims that SSL pretraining improves calibrated confidence and confidence-based abstention compared to training from scratch, yet once accuracy saturates, selective performance can still change markedly across checkpoints and longer pretraining does not consistently improve reliability.

What carries the argument

calibrated confidence-based abstention under a fixed fine-tuning protocol applied to multiple SSL checkpoints
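
To make the mechanism concrete, here is a minimal sketch of confidence-based abstention. It is not the authors' implementation. It assumes the model emits one calibrated confidence per image and defers every case below a threshold. The function name `selective_metrics`, the threshold value, and the toy numbers are illustrative assumptions, not values from the paper.

```python
def selective_metrics(confidences, predictions, labels, threshold):
    """Return (coverage, selective_accuracy) for one confidence threshold.

    Coverage is the fraction of cases the model keeps (does not defer).
    Selective accuracy is the accuracy on the kept cases only.
    """
    accepted = [i for i, c in enumerate(confidences) if c >= threshold]
    coverage = len(accepted) / len(labels)
    if not accepted:
        # Every case was deferred, so selective accuracy is undefined.
        return coverage, float("nan")
    n_correct = sum(1 for i in accepted if predictions[i] == labels[i])
    return coverage, n_correct / len(accepted)


# Toy inputs with made-up numbers (hypothetical, not results from the paper).
conf = [0.95, 0.90, 0.60, 0.55, 0.80]
pred = [0, 2, 1, 3, 0]
true = [0, 2, 2, 3, 1]
cov, sel_acc = selective_metrics(conf, pred, true, threshold=0.7)
```

Raising the threshold lowers coverage and sends more images to clinical review; selective accuracy is always computed over the retained cases only.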

If this is right

SSL pretraining enhances selective accuracy and selective macro-F1 across datasets compared to from-scratch training.
Selective performance varies across checkpoints even after accuracy plateaus.
Extending pretraining duration does not uniformly improve abstention reliability.
Abstention-aware evaluation is necessary for assessing safety in clinical screening tasks.
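
Selective macro-F1 can be sketched the same way: compute macro-F1 on the accepted cases only. The version below is an illustration under stated assumptions rather than the paper's evaluation code. In particular, it averages over the classes that appear among the accepted predictions or labels, which is an assumed convention, and the helper names and toy data are hypothetical.

```python
def macro_f1(predictions, labels):
    """Macro-F1 over the classes that occur in predictions or labels."""
    classes = sorted(set(predictions) | set(labels))
    f1_scores = []
    for c in classes:
        tp = sum(1 for p, y in zip(predictions, labels) if p == c and y == c)
        fp = sum(1 for p, y in zip(predictions, labels) if p == c and y != c)
        fn = sum(1 for p, y in zip(predictions, labels) if p != c and y == c)
        # The denominator is positive because c occurs at least once.
        f1_scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(f1_scores) / len(f1_scores)


def selective_macro_f1(confidences, predictions, labels, threshold):
    """Macro-F1 restricted to cases with confidence >= threshold."""
    kept = [(p, y) for c, p, y in zip(confidences, predictions, labels)
            if c >= threshold]
    if not kept:
        return float("nan")
    kept_preds, kept_labels = zip(*kept)
    return macro_f1(kept_preds, kept_labels)


# Toy inputs with made-up numbers (hypothetical, not results from the paper).
conf = [0.95, 0.90, 0.60, 0.55, 0.80]
pred = [0, 2, 1, 3, 0]
true = [0, 2, 2, 3, 1]
full_f1 = selective_macro_f1(conf, pred, true, threshold=0.0)
sel_f1 = selective_macro_f1(conf, pred, true, threshold=0.7)
```

Because macro-F1 weights every class equally, a single misclassified rare-class case in the accepted subset can move it substantially, which is one reason it can diverge from selective accuracy.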

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Developers may need to track selective metrics across many checkpoints instead of stopping at the accuracy peak.
The same checkpoint-selection approach could be tested on other medical imaging tasks that require safe deferral.
In deployment, reliability monitoring might shift from single best-accuracy models to families of checkpoints evaluated for abstention quality.
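
One way to act on the first point is to rank checkpoints by a selective metric at a fixed coverage instead of by plain accuracy. The sketch below is hypothetical: the checkpoint names, the 50% coverage target, and the toy confidences are assumptions chosen for illustration. It shows two checkpoints with identical accuracy whose selective accuracy differs.

```python
def selective_accuracy_at_coverage(confidences, correct, coverage):
    """Accuracy over the most confident `coverage` fraction of cases."""
    n_keep = max(1, round(coverage * len(confidences)))
    ranked = sorted(zip(confidences, correct),
                    key=lambda pair: pair[0], reverse=True)
    kept = ranked[:n_keep]
    return sum(is_correct for _, is_correct in kept) / n_keep


# Hypothetical per-checkpoint outputs: confidence and 0/1 correctness per case.
checkpoints = {
    "epoch_100": {"conf": [0.9, 0.8, 0.7, 0.6], "correct": [1, 1, 0, 1]},
    "epoch_400": {"conf": [0.9, 0.8, 0.7, 0.6], "correct": [1, 0, 1, 1]},
}

accuracy = {
    name: sum(d["correct"]) / len(d["correct"])
    for name, d in checkpoints.items()
}
selective = {
    name: selective_accuracy_at_coverage(d["conf"], d["correct"], coverage=0.5)
    for name, d in checkpoints.items()
}
best = max(selective, key=selective.get)
```

Both toy checkpoints have the same overall accuracy, so an accuracy-only criterion cannot separate them, while the selective criterion prefers the one whose errors fall among its low-confidence cases.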

Load-bearing premise

The fixed fine-tuning protocol combined with calibrated confidence-based abstention metrics sufficiently captures real-world clinical safety, and the chosen datasets represent the variability in actual screening practice.

What would settle it

A new DR dataset where SSL pretraining fails to raise selective accuracy or selective macro-F1 above from-scratch baselines, or where longer pretraining checkpoints show steadily worse abstention after accuracy has plateaued, would falsify the reported pattern.

Figures

Figures reproduced from arXiv: 2605.19133 by Jan H. Terheyden, Lorenz Sparrenberg, Muskaan Chopra, Rafet Sifa.

**Figure 2.** Selective prediction comparison between SiCoVa and …

**Figure 3.** Grad-CAM visualizations for SiCoVa across three down…

**Figure 4.** Representation structure and severity-dependent abstention using PaCMAP and class-wise acceptance statistics at 70% coverage.
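
The class-wise acceptance statistics named in the Figure 4 caption can be illustrated with a short sketch. This is an assumption about how such statistics might be computed, not the authors' code: it keeps the most confident 70% of cases and reports, per severity grade, the fraction of that grade's cases that were accepted. The function name and the toy labels and confidences are hypothetical.

```python
from collections import Counter


def classwise_acceptance(confidences, labels, coverage=0.7):
    """Per-class fraction of cases accepted at a fixed overall coverage."""
    n_keep = round(coverage * len(labels))
    # Indices sorted from most to least confident.
    order = sorted(range(len(labels)),
                   key=lambda i: confidences[i], reverse=True)
    accepted = order[:n_keep]
    total = Counter(labels)
    kept = Counter(labels[i] for i in accepted)
    return {c: kept[c] / total[c] for c in sorted(total)}


# Toy data: DR grades 0-4 with made-up confidences (hypothetical).
labels = [0, 0, 0, 0, 1, 1, 2, 2, 3, 4]
conf = [0.99, 0.97, 0.95, 0.93, 0.60, 0.91, 0.55, 0.89, 0.50, 0.87]
rates = classwise_acceptance(conf, labels, coverage=0.7)
```

Acceptance rates that differ strongly by grade indicate that abstention is concentrated in particular severity levels, which matters when the deferred cases are the clinically ambiguous ones.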

Original abstract

Self-supervised learning (SSL) is now a standard way to pretrain medical image models, but performance is still mostly judged by downstream accuracy. For safety-critical screening tasks such as diabetic retinopathy grading, this is not enough: a model must also know when its predictions are unreliable and defer uncertain cases for clinical review. In this work, we examine how the length of SSL pretraining influences calibrated confidence and confidence-based abstention. We evaluate multiple SSL checkpoints under a fixed fine-tuning protocol and assess calibrated confidence, coverage, selective accuracy, and selective macro-F1. Across datasets and data regimes, SSL pretraining improves selective prediction compared to training from scratch. Unlike prior SSL studies that primarily evaluate downstream accuracy or AUROC, we analyze how SSL pretraining duration influences confidence behavior under calibrated confidence-based abstention. However, once accuracy saturates, selective performance can still change markedly across checkpoints, and longer pretraining does not consistently improve reliability. These results underscore the importance of abstention-aware evaluation and suggest that pretraining length should be treated as an important reliability-related design choice rather than only a computational detail. Code is available at GitHub.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

SSL pretraining duration still moves selective prediction metrics in DR screening after accuracy saturates, but the single fixed fine-tuning protocol leaves open whether those shifts are intrinsic or protocol-dependent.

The key point is that SSL pretraining length affects calibrated confidence and abstention behavior in diabetic retinopathy grading even after accuracy has stopped improving, and longer pretraining does not reliably produce better selective performance. The paper compares multiple SSL checkpoints against a from-scratch baseline under one fixed fine-tuning setup and reports gains in coverage, selective accuracy, and selective macro-F1 across datasets and data regimes. That focus on abstention-aware metrics rather than accuracy or AUROC alone is the clearest addition to the existing SSL literature in medical imaging. They also release code, which helps with checking the exact protocols. The observation that selective metrics can keep changing after accuracy plateaus is practically relevant for anyone planning to deploy these models with a deferral option.

The main soft spot is the fixed fine-tuning protocol itself. If optimal hyperparameters shift with pretraining duration, then differences in abstention could partly trace back to that mismatch instead of the representations. The stress-test note flags this correctly, and without sensitivity checks on the optimizer or schedule the isolation of pretraining length is only partial. Dataset coverage is another limitation worth noting: the usual DR collections are used, but they do not fully sample camera types, resolutions, or population shifts that appear in real screening programs. The abstract is thin on error bars and exact numbers, though the full text presumably supplies them.

This work is aimed at researchers building reliable medical vision systems who already care about selective prediction. A reader working on SSL for safety-critical tasks would find the checkpoint analysis useful. It is coherent enough and grounded enough to deserve a serious referee who can press on the fine-tuning controls and ask for more ablations.

Referee Report

3 major / 2 minor

Summary. The paper examines how the duration of self-supervised learning (SSL) pretraining affects calibrated confidence and confidence-based abstention in diabetic retinopathy (DR) grading models. It evaluates multiple SSL checkpoints under a single fixed fine-tuning protocol and reports coverage, selective accuracy, and selective macro-F1. The central claims are that SSL pretraining improves selective prediction relative to training from scratch across datasets and regimes, that selective performance can still vary substantially once accuracy saturates, and that longer pretraining does not consistently improve reliability. The work argues for abstention-aware evaluation in safety-critical screening tasks.

Significance. If the reported patterns hold under more varied protocols and datasets, the result would be significant for medical imaging: it shows that accuracy saturation does not imply stable selective performance and that pretraining length is a reliability design choice rather than a mere computational detail. The public code release is a clear strength that aids verification.

major comments (3)

[Experimental protocol / Results] The central claim that SSL pretraining improves selective metrics rests on a fixed fine-tuning protocol applied uniformly to all checkpoints and the from-scratch baseline. Without evidence that this protocol remains optimal or non-interacting with checkpoint quality (e.g., no ablation of learning-rate schedules or epoch counts per checkpoint), observed fluctuations in selective accuracy and macro-F1 after accuracy saturation could be protocol artifacts rather than intrinsic properties of the pretrained representations.
[Datasets and evaluation] The abstract and results assert improvements “across datasets and data regimes,” yet the manuscript supplies no quantitative description of dataset diversity (camera models, resolution distributions, population demographics, or label-noise levels). This omission is load-bearing for the safety conclusion, because selective-prediction gains on homogeneous data do not necessarily translate to the variability encountered in clinical DR screening.
[Results] The observation that “longer pretraining does not consistently improve reliability” is presented without statistical quantification of the variation (e.g., standard errors on selective macro-F1 across checkpoints once accuracy plateaus, or a formal test for trend). A single fixed protocol plus limited checkpoint sampling leaves open the possibility that the non-monotonic behavior is under-powered rather than a robust negative finding.

minor comments (2)

[Methods] Clarify in the methods whether the confidence calibration (temperature scaling or similar) is performed on a held-out validation set or on the same data used for selective-metric computation; the current description leaves room for optimistic bias.
[Figures] Figure captions and axis labels should explicitly state the number of checkpoints, the exact pretraining epochs or steps corresponding to each point, and whether error bars represent standard deviation over multiple fine-tuning seeds.
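
The first minor comment concerns where calibration is fitted. As an illustration only, the sketch below assumes temperature scaling, which the referee names as one possibility and which is not confirmed as the paper's method. The temperature is fitted on a held-out validation split by a simple grid search over negative log-likelihood, so that it can then be applied unchanged to the test split used for selective metrics. The function names, the grid, and the toy logits are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels."""
    losses = [-math.log(softmax(row, temperature)[y])
              for row, y in zip(logit_rows, labels)]
    return sum(losses) / len(losses)


def fit_temperature(val_logits, val_labels):
    """Pick the temperature in [0.5, 5.0] that minimizes validation NLL."""
    grid = [0.5 + 0.05 * k for k in range(91)]
    return min(grid, key=lambda t: nll(val_logits, val_labels, t))


# Held-out validation split (toy, made-up): overconfident two-class logits,
# three of four predictions correct.
val_logits = [[4.0, 0.0], [4.0, 0.0], [4.0, 0.0], [0.0, 4.0]]
val_labels = [0, 0, 1, 1]
temperature = fit_temperature(val_logits, val_labels)

# The fitted temperature is then reused, unchanged, on test logits.
test_confidence = max(softmax([3.0, 0.5], temperature))
```

Fitting the temperature on the same cases that are later scored would let the calibration step see the evaluation labels, which is the optimistic bias the comment warns about.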

Simulated Author's Rebuttal

3 responses · 0 unresolved

Thank you for the detailed review. We appreciate the referee's focus on experimental rigor and generalizability. Below we respond to each major comment, proposing revisions where appropriate to strengthen the manuscript.

Referee: The central claim that SSL pretraining improves selective metrics rests on a fixed fine-tuning protocol applied uniformly to all checkpoints and the from-scratch baseline. Without evidence that this protocol remains optimal or non-interacting with checkpoint quality (e.g., no ablation of learning-rate schedules or epoch counts per checkpoint), observed fluctuations in selective accuracy and macro-F1 after accuracy saturation could be protocol artifacts rather than intrinsic properties of the pretrained representations.

Authors: We intentionally employed a single fixed fine-tuning protocol across all SSL checkpoints and the from-scratch baseline to isolate the impact of pretraining duration on calibrated confidence and selective performance. This approach prevents confounding variables from differing optimization strategies and enables a controlled comparison. While we agree that protocol interactions could exist, the consistent improvements in selective metrics under this protocol support our claims. We will revise the discussion to explicitly acknowledge this design choice as a potential limitation and suggest that future studies explore adaptive fine-tuning per checkpoint.
revision: partial

Referee: The abstract and results assert improvements “across datasets and data regimes,” yet the manuscript supplies no quantitative description of dataset diversity (camera models, resolution distributions, population demographics, or label-noise levels). This omission is load-bearing for the safety conclusion, because selective-prediction gains on homogeneous data do not necessarily translate to the variability encountered in clinical DR screening.

Authors: We will update the manuscript to provide a more detailed quantitative description of the datasets, including information on camera models, image resolutions, patient demographics, and any available details on label noise. The experiments use well-established public DR datasets that are representative of clinical variability, but we concur that explicit quantification will better support the generalizability of our safety-related conclusions.
revision: yes

Referee: The observation that “longer pretraining does not consistently improve reliability” is presented without statistical quantification of the variation (e.g., standard errors on selective macro-F1 across checkpoints once accuracy plateaus, or a formal test for trend). A single fixed protocol plus limited checkpoint sampling leaves open the possibility that the non-monotonic behavior is under-powered rather than a robust negative finding.

Authors: We will incorporate statistical quantification by adding standard errors to the reported selective metrics and conducting a formal trend analysis or non-parametric test for the plateau region across checkpoints. This will provide stronger evidence for the observed non-monotonic behavior in reliability metrics.
revision: yes
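
The quantification promised in the last response could take roughly the following shape. This is a sketch under assumptions, not the authors' planned analysis: it computes a standard error over fine-tuning seeds for one checkpoint and a Spearman rank correlation between pretraining epoch and a selective metric as a simple non-parametric trend statistic. The helper names and all numbers are hypothetical, and the Spearman helper assumes no tied values.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean over repeated runs (e.g. seeds)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def spearman_rho(x, y):
    """Spearman rank correlation; assumes no ties in x or y."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        result = [0] * len(v)
        for rank, i in enumerate(order, start=1):
            result[i] = rank
        return result

    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical selective macro-F1 for one checkpoint over three seeds.
seed_scores = [0.80, 0.82, 0.84]
se = standard_error(seed_scores)

# Hypothetical mean selective macro-F1 per pretraining checkpoint.
epochs = [100, 200, 300, 400, 500]
mean_scores = [0.80, 0.84, 0.79, 0.83, 0.81]
rho = spearman_rho(epochs, mean_scores)
```

A rank correlation near zero in the plateau region would be consistent with the claim that longer pretraining does not consistently improve reliability, although a significance test with enough checkpoints would still be needed to distinguish a real absence of trend from an under-powered one.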

Circularity Check

0 steps flagged

Empirical evaluation study with no circular derivation or self-referential reduction

This is an experimental comparison paper that evaluates SSL pretraining duration effects on selective prediction metrics (coverage, selective accuracy, selective macro-F1) under a fixed fine-tuning protocol, contrasting against training-from-scratch baselines across datasets. No mathematical derivation chain exists; claims rest on direct empirical measurements rather than first-principles results that reduce to inputs by construction. No self-definitional steps, fitted parameters renamed as predictions, or load-bearing self-citations appear in the abstract or described methods. The fixed-protocol design and dataset comparisons are presented as external benchmarks, rendering the central findings self-contained without circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The work is an empirical evaluation; no free parameters, axioms, or invented entities are identifiable from the abstract.

pith-pipeline@v0.9.0 · 5744 in / 1054 out tokens · 40684 ms · 2026-05-20T10:29:31.246506+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Passage: "We evaluate multiple SSL checkpoints under a fixed fine-tuning protocol and assess calibrated confidence, coverage, selective accuracy, and selective macro-F1."

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Passage: "longer pretraining does not consistently improve reliability"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
