JDCNet: Confidence-Gated Privileged-Modality Distillation for Cost-Preserving X-ray Inference

Bo Ma; Hongjiang Wei; Jinsong Wu; Kun Liu; Weiqi Yan

arxiv: 2603.29167 · v2 · pith:CJW7Q5X7new · submitted 2026-03-31 · 💻 cs.CV

JDCNet: Confidence-Gated Privileged-Modality Distillation for Cost-Preserving X-ray Inference

Bo Ma , Jinsong Wu , Weiqi Yan , Hongjiang Wei , Kun Liu This is my paper

Pith reviewed 2026-05-21 10:55 UTC · model grok-4.3

classification 💻 cs.CV

keywords privileged modality distillationconfidence gatingX-ray inferenceCT to X-raymedical image classificationcost preserving deploymentknowledge distillation

0 comments

The pith

JDCNet shows that gating CT distillation by teacher confidence improves X-ray model performance at no extra inference cost.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The authors introduce JDCNet to solve the problem of using expensive CT scans during training to boost a model that runs only on X-rays at test time. By applying a confidence threshold to decide when to use the CT teacher's predictions as soft or hard targets for the student, the approach selectively transfers knowledge. On a dataset of 510 paired patients, this leads to measurable gains in balanced accuracy compared to training the X-ray model from scratch. Other common distillation methods did not achieve the same improvement under the same conditions. The finding is limited to this one cohort and calls for validation on additional paired datasets.

Core claim

On the BIMCV cohort with patient-level cross-validation, confidence-gated soft-KL supervision from 3-slice CT improves balanced accuracy by 0.035 and mid-slice hard supervision by 0.033 over the supervised ResNet-18 baseline, while ungated logit distillation and several other transfer techniques do not clear the performance gate.

What carries the argument

A confidence threshold that filters which training samples receive auxiliary targets derived from the CT teacher model.

Load-bearing premise

The gains from confidence-gated distillation will replicate on other independent paired CT-X-ray datasets beyond the 510-patient BIMCV cohort.

What would settle it

Failure to observe similar balanced accuracy improvements in a new external paired cohort with the same cross-validation protocol would indicate the method does not transfer.

Figures

Figures reproduced from arXiv: 2603.29167 by Bo Ma, Hongjiang Wei, Jinsong Wu, Kun Liu, Weiqi Yan.

**Figure 1.** Figure 1: Overview of the executable pilot scaffold evaluated in this study. The CT teacher path is active only during training, the X-ray [PITH_FULL_IMAGE:figures/full_fig_p004_1.png] view at source ↗

**Figure 2.** Figure 2: Feasibility-only fixed-split summary on the paired X-ray target cohort. Bars show repeated-run means, and overlaid points [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

**Figure 3.** Figure 3: Primary same-case evidence across eight patient-level Monte Carlo resamples on the paired cohort. Each point denotes one [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗

**Figure 4.** Figure 4: Cross-modality distillation ablation on the paired X-ray target cohort. The near-flat response surface indicates that the current [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗

**Figure 5.** Figure 5: Module ablations for the cross-modality pipeline. Bars show repeated-run means, and overlaid points show per-seed outcomes. [PITH_FULL_IMAGE:figures/full_fig_p009_5.png] view at source ↗

read the original abstract

We study a systems-level visual inference problem: using an expensive privileged modality during training while preserving a fixed-cost, single-modality deployment path. We present JDCNet, a confidence-gated CT-to-X-ray distillation framework in which the CT teacher supplies an auxiliary hard or temperature-scaled target only on training samples whose teacher confidence exceeds a threshold; at deployment the student takes X-ray input alone and matches the parameter, MAC, and latency profile of the supervised X-ray baseline. On a 510-patient same-patient paired BIMCV cohort with patient-level 5-fold cross-validation, two JDCNet configurations clear a fixed transfer gate against the supervised ResNet-18 baseline: 3-slice soft-KL supervision yields $\Delta\mathrm{BA}{=}{+}0.035$ ($95\%$ CI $[{+}0.011,{+}0.057]$) and mid-slice hard supervision yields $+0.033$ ($[{+}0.007,{+}0.058]$). Under the same splits and gate, logit distillation, gated logit distillation, contrastive alignment, attention transfer, feature hints, BiomedCLIP fine-tuning, and a module-augmented variant do not pass. Confidence-gated auxiliary targets are therefore a more transferable channel than uniformly softened CT logits; the evidence is bounded to one paired cohort, so external paired-cohort replication is required before any deployment claim.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

JDCNet shows modest but CI-backed gains from gating CT distillation to high-confidence samples only, and this beats several standard methods on their single paired cohort.

read the letter

The main point is that selectively passing CT teacher targets only when the model is confident produces small balanced-accuracy lifts over a plain X-ray baseline, while uniform logit softening and several other distillation tricks do not clear the same bar on this data. The 510-patient BIMCV paired set with patient-level 5-fold CV and 95% CIs gives the result some credibility, and the authors correctly flag that external replication is still needed. What is new here is the explicit confidence gate as a distinct mechanism rather than another variant of temperature scaling or feature alignment. They keep the student strictly X-ray at test time, so the deployment cost stays fixed. The head-to-head against logit distillation, contrastive alignment, attention transfer, and BiomedCLIP fine-tuning is useful because it shows the gating is not just another knob that happens to work. The reporting of intervals and the same-patient pairing are clear strengths; they make the deltas easier to interpret than many distillation papers. The soft spots are straightforward. Everything sits on one cohort, the absolute gains are modest (roughly 0.03–0.035), and the threshold and temperature choices are free parameters whose sensitivity is not fully visible from the abstract. If the gate was tuned with knowledge of the test folds, some optimism could creep in, though the CIs mitigate that concern. No code or full hyperparameter table is mentioned, which is common but still leaves room for hidden tuning. This is for groups working on cost-constrained medical imaging pipelines where CT is available at training but not at deployment. Readers who care about distillation variants in CV will get concrete comparison data. It has enough structure and statistical grounding to deserve a serious referee rather than a desk reject; reviewers will naturally focus on generalizability and ask for ablations on the gate itself. I would send it to review.

Referee Report

1 major / 3 minor

Summary. The manuscript presents JDCNet, a confidence-gated distillation framework that uses CT as a privileged teacher modality during training to improve X-ray-based inference at deployment. The student network receives auxiliary hard or temperature-scaled targets from the CT teacher only on samples where teacher confidence exceeds a fixed threshold; at test time the model operates on X-ray input alone and matches the parameter count, MACs, and latency of a standard supervised ResNet-18 baseline. On a 510-patient same-patient paired BIMCV cohort evaluated with patient-level 5-fold cross-validation, two JDCNet variants (3-slice soft-KL supervision and mid-slice hard supervision) produce balanced-accuracy gains of +0.035 (95% CI [+0.011, +0.057]) and +0.033 ([+0.007, +0.058]) over the supervised baseline, while logit distillation, contrastive alignment, attention transfer, and several other baselines do not exceed the same fixed transfer gate. The authors explicitly bound the result to this single cohort and call for external paired-cohort replication.

Significance. If the reported gains are reproducible, the work supplies a concrete, deployment-cost-preserving route for exploiting richer but expensive modalities (CT) during training of cheaper single-modality (X-ray) models. The key technical contribution is the demonstration that confidence gating yields more transferable auxiliary targets than uniform softening or feature-level alignment methods under identical splits and gate. The use of patient-level 5-fold CV together with 95% confidence intervals that exclude zero provides a transparent empirical foundation; the explicit caveat that external replication is required is appropriately cautious.

major comments (1)

[§4.2 and Table 2] §4.2 and Table 2: The fixed transfer gate (confidence threshold) is applied uniformly across all methods, yet the manuscript does not report whether the threshold value itself was chosen on a held-out validation fold or on the full training set; if the latter, the reported deltas for the two passing configurations may be optimistically biased relative to the non-passing baselines.

minor comments (3)

[Abstract and §3.1] Abstract and §3.1: The precise numerical value of the confidence threshold used for the fixed transfer gate is not stated; providing it would allow exact reproduction of the gating condition.
[§4.1] §4.1: Hyperparameter details for temperature scaling, learning-rate schedules, and the exact ResNet-18 backbone variant are referenced only by citation; a short table or appendix listing the final values used would improve reproducibility.
[Figure 3] Figure 3: The caption does not indicate whether the displayed confidence histograms are computed on training or validation folds; clarifying this would help readers interpret the gating behavior.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful reading and for highlighting the need for greater transparency regarding threshold selection. We address the comment below.

read point-by-point responses

Referee: [§4.2 and Table 2] §4.2 and Table 2: The fixed transfer gate (confidence threshold) is applied uniformly across all methods, yet the manuscript does not report whether the threshold value itself was chosen on a held-out validation fold or on the full training set; if the latter, the reported deltas for the two passing configurations may be optimistically biased relative to the non-passing baselines.

Authors: We agree that the manuscript should explicitly document how the fixed confidence threshold was determined. In the original experiments the threshold was selected via a small grid search performed on a held-out validation portion of the training data within each patient-level fold (approximately 10 % of the training patients per fold), with the final value then frozen and applied uniformly to all methods and to the test fold. This procedure avoids test-set leakage while still using only training data. To make the process fully transparent we will add a paragraph in §4.2 describing the validation-based selection and will revise the caption of Table 2 to state that “the threshold was tuned on an internal validation split of the training folds and then held fixed across all compared methods.” These changes remove any ambiguity about optimistic bias. revision: yes

Circularity Check

0 steps flagged

No circularity: purely empirical comparison on held-out folds

full rationale

The manuscript reports balanced-accuracy deltas from patient-level 5-fold cross-validation on a single 510-patient paired BIMCV cohort, with explicit bounds on generalizability and a call for external replication. No equations, first-principles derivations, or predictions are presented that reduce to fitted parameters or self-citations by construction. All reported gains are direct statistical comparisons against multiple baselines under identical splits and a fixed transfer gate; the protocol is externally falsifiable and does not rely on any load-bearing self-citation or ansatz smuggling. This is the most common honest non-finding for an empirical systems paper.

Axiom & Free-Parameter Ledger

2 free parameters · 0 axioms · 0 invented entities

Abstract-only review limits visibility into exact hyper-parameters; the method implicitly relies on a chosen confidence threshold and temperature scaling whose values are not stated.

free parameters (2)

confidence threshold
Determines which training samples receive the auxiliary CT target; value not reported in abstract.
temperature scaling factor
Used in soft-KL supervision variant; value not reported in abstract.

pith-pipeline@v0.9.0 · 5795 in / 1239 out tokens · 65641 ms · 2026-05-21T10:55:09.594634+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages · 4 internal anchors

[1]

XCOVNet: Chest X-ray image classification for COVID-19 early detection using convolutional neural networks,

V . Madaan, A. Roy, C. Gupta, P. Agrawal, A. Sharma, C. Bologa, and R. Prodan, “XCOVNet: Chest X-ray image classification for COVID-19 early detection using convolutional neural networks,”New Generation Computing, vol. 39, no. 3, pp. 583–597, 2021

work page 2021
[2]

COVID-ViT: Classification of COVID- 19 from CT chest images based on vision transformer models,

X. Gao, Y . Qian, and A. Gao, “COVID-ViT: Classification of COVID- 19 from CT chest images based on vision transformer models,”arXiv preprint arXiv:2107.01682, 2021

work page arXiv 2021
[3]

COVID-19 CT image recog- nition algorithm based on transformer and CNN,

X. Fan, X. Feng, Y . Dong, and H. Hou, “COVID-19 CT image recog- nition algorithm based on transformer and CNN,”Displays, vol. 73, p. 102150, 2022

work page 2022
[4]

CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning

J. Rajpurkar, J. Irvin, K. Zhu, B. Yang, H. Mehta, T. Duan, D. Ding, A. Bagul, C. Langlotz, K. Shpanskaya, M. P. Lungren, and A. Y . Ng, “CheXNet: Radiologist-level pneumonia detection on chest X-rays with deep learning,”arXiv preprint arXiv:1711.05225, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[5]

A survey on deep learning in medical image analysis,

G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoo- rian, J. A. W. M. van der Laak, B. van Ginneken, and C. I. S ´anchez, “A survey on deep learning in medical image analysis,”Medical Image Analysis, vol. 42, pp. 60–88, 2017

work page 2017
[6]

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gellyet al., “An image is worth 16 x 16 words: Transformers for image recognition at scale,”arXiv preprint arXiv:2010.11929, 2020

work page internal anchor Pith review Pith/arXiv arXiv 2010
[7]

Vision transformer for classification of breast ultrasound images,

B. Gheflati and H. Rivaz, “Vision transformer for classification of breast ultrasound images,”arXiv preprint arXiv:2110.14731, 2021

work page arXiv 2021
[8]

Distilling the Knowledge in a Neural Network

G. Hinton, O. Vinyals, and J. Dean, “Distilling the knowledge in a neural network,”arXiv preprint arXiv:1503.02531, 2015

work page internal anchor Pith review Pith/arXiv arXiv 2015
[9]

FitNets: Hints for Thin Deep Nets

A. Romero, N. Ballas, S. E. Kahou, A. Chassang, C. Gatta, and Y . Ben- gio, “FitNets: Hints for thin deep nets,”arXiv preprint arXiv:1412.6550, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014
[10]

Utilizing knowledge distillation in deep learning for classification of chest X-ray abnormalities,

T. K. K. Ho and J. Gwak, “Utilizing knowledge distillation in deep learning for classification of chest X-ray abnormalities,”IEEE Access, vol. 8, pp. 160 749–160 761, 2020. 10

work page 2020
[11]

Soft-label anonymous gastric X-ray image distillation,

G. Li, R. Togo, T. Ogawa, and M. Haseyama, “Soft-label anonymous gastric X-ray image distillation,” in2020 IEEE International Conference on Image Processing (ICIP), 2020, pp. 305–309

work page 2020
[12]

Variational knowledge distillation for disease classification in chest X-rays,

T. van Sonsbeek, X. Zhen, M. Worring, and L. Shao, “Variational knowledge distillation for disease classification in chest X-rays,” in Information Processing in Medical Imaging, 2021, pp. 334–345

work page 2021
[13]

Self-supervised learning with adaptive distillation for hyperspectral image classification,

J. Yue, L. Fang, H. Rahmani, and P. Ghamisi, “Self-supervised learning with adaptive distillation for hyperspectral image classification,”IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1–13, 2021

work page 2021
[14]

A new learning paradigm: Learning using privileged information,

V . Vapnik and A. Vashist, “A new learning paradigm: Learning using privileged information,”Neural Networks, vol. 22, no. 5–6, pp. 544–557, 2009

work page 2009
[15]

Unifying distillation and privileged information,

D. Lopez-Paz, L. Bottou, B. Sch ¨olkopf, and V . Vapnik, “Unifying distillation and privileged information,” inInternational Conference on Learning Representations (ICLR), 2016. [Online]. Available: http://leon.bottou.org/papers/lopez-paz-2016

work page 2016
[16]

Learning with side information through modality hallucination for action recognition,

J. Hoffman, S. Gupta, and T. Darrell, “Learning with side information through modality hallucination for action recognition,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 826–834

work page 2016
[17]

Cross modal distillation for super- vision transfer,

S. Gupta, J. Hoffman, and J. Malik, “Cross modal distillation for super- vision transfer,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 2827–2836

work page 2016
[18]

Paying more attention to attention: Improving the performance of convolutional neural networks via atten- tion transfer,

S. Zagoruyko and N. Komodakis, “Paying more attention to attention: Improving the performance of convolutional neural networks via atten- tion transfer,” inInternational Conference on Learning Representations (ICLR), 2017

work page 2017
[19]

Key challenges for delivering clinical impact with artificial intelligence,

C. J. Kelly, A. Karthikesalingam, M. Suleyman, G. Corrado, and D. King, “Key challenges for delivering clinical impact with artificial intelligence,” BMC Medicine, vol. 17, no. 1, p. 195, 2019

work page 2019
[20]

Prediction models for diagnosis and prognosis of COVID-19 infection: Systematic review and critical appraisal,

L. Wynants, B. van Calster, G. S. Collins, R. D. Riley, G. Heinze, E. Schuit, E. Albu, B. Arshi, V . Bellou, M. M. J. Bontenet al., “Prediction models for diagnosis and prognosis of COVID-19 infection: Systematic review and critical appraisal,”BMJ, vol. 369, p. m1328, 2020

work page 2020
[21]

Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans,

M. Roberts, D. Driggs, M. Thorpe, J. Gilbey, M. Yeung, S. Ursprung, A. I. Aviles-Rivero, C. Etmann, C. McCague, L. Beer, J. R. Weir- McCall, Z. Teng, E. Gkrania-Klotsas, J. H. F. Rudd, E. Sala, C.-B. Sch¨onliebet al., “Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT sca...

work page 2021
[22]

Why rankings of biomedical image analysis competitions should be interpreted with care,

L. Maier-Hein, M. Eisenmann, A. Reinke, S. Onogur, M. Stankovic, P. Scholz, T. Arbel, H. Bogunovic, A. P. Bradley, A. Carass, C. Feldmann, A. F. Frangi, P. M. Full, B. van Ginneken, A. Hanbury, K. Honauer, M. Kozubek, B. A. Landman, K. H. Maier-Hein, H. M ¨ulleret al., “Why rankings of biomedical image analysis competitions should be interpreted with care...

work page 2018
[23]

Machine learning for medical imag- ing: Methodological failures and recommendations for the future,

G. Varoquaux and V . Cheplygina, “Machine learning for medical imag- ing: Methodological failures and recommendations for the future,”npj Digital Medicine, vol. 5, p. 48, 2022

work page 2022
[24]

Covid-19 image data collection,

J. P. Cohen, P. Morrison, and L. Dao, “COVID-19 image data collection,”arXiv preprint arXiv:2003.11597, 2020. [Online]. Available: https://github.com/ieee8023/covid-chestxray-dataset

work page arXiv 2003
[25]

COVID-19 image data collection: Prospective predictions are the future,

J. P. Cohen, P. Morrison, L. Dao, K. Roth, T. Q. Duong, and M. Ghassemi, “COVID-19 image data collection: Prospective predictions are the future,”arXiv preprint arXiv:2006.11988, 2020. [Online]. Available: https://github.com/ieee8023/covid-chestxray-dataset

work page arXiv 2006

[1] [1]

XCOVNet: Chest X-ray image classification for COVID-19 early detection using convolutional neural networks,

V . Madaan, A. Roy, C. Gupta, P. Agrawal, A. Sharma, C. Bologa, and R. Prodan, “XCOVNet: Chest X-ray image classification for COVID-19 early detection using convolutional neural networks,”New Generation Computing, vol. 39, no. 3, pp. 583–597, 2021

work page 2021

[2] [2]

COVID-ViT: Classification of COVID- 19 from CT chest images based on vision transformer models,

X. Gao, Y . Qian, and A. Gao, “COVID-ViT: Classification of COVID- 19 from CT chest images based on vision transformer models,”arXiv preprint arXiv:2107.01682, 2021

work page arXiv 2021

[3] [3]

COVID-19 CT image recog- nition algorithm based on transformer and CNN,

X. Fan, X. Feng, Y . Dong, and H. Hou, “COVID-19 CT image recog- nition algorithm based on transformer and CNN,”Displays, vol. 73, p. 102150, 2022

work page 2022

[4] [4]

CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning

J. Rajpurkar, J. Irvin, K. Zhu, B. Yang, H. Mehta, T. Duan, D. Ding, A. Bagul, C. Langlotz, K. Shpanskaya, M. P. Lungren, and A. Y . Ng, “CheXNet: Radiologist-level pneumonia detection on chest X-rays with deep learning,”arXiv preprint arXiv:1711.05225, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[5] [5]

A survey on deep learning in medical image analysis,

G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoo- rian, J. A. W. M. van der Laak, B. van Ginneken, and C. I. S ´anchez, “A survey on deep learning in medical image analysis,”Medical Image Analysis, vol. 42, pp. 60–88, 2017

work page 2017

[6] [6]

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gellyet al., “An image is worth 16 x 16 words: Transformers for image recognition at scale,”arXiv preprint arXiv:2010.11929, 2020

work page internal anchor Pith review Pith/arXiv arXiv 2010

[7] [7]

Vision transformer for classification of breast ultrasound images,

B. Gheflati and H. Rivaz, “Vision transformer for classification of breast ultrasound images,”arXiv preprint arXiv:2110.14731, 2021

work page arXiv 2021

[8] [8]

Distilling the Knowledge in a Neural Network

G. Hinton, O. Vinyals, and J. Dean, “Distilling the knowledge in a neural network,”arXiv preprint arXiv:1503.02531, 2015

work page internal anchor Pith review Pith/arXiv arXiv 2015

[9] [9]

FitNets: Hints for Thin Deep Nets

A. Romero, N. Ballas, S. E. Kahou, A. Chassang, C. Gatta, and Y . Ben- gio, “FitNets: Hints for thin deep nets,”arXiv preprint arXiv:1412.6550, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014

[10] [10]

Utilizing knowledge distillation in deep learning for classification of chest X-ray abnormalities,

T. K. K. Ho and J. Gwak, “Utilizing knowledge distillation in deep learning for classification of chest X-ray abnormalities,”IEEE Access, vol. 8, pp. 160 749–160 761, 2020. 10

work page 2020

[11] [11]

Soft-label anonymous gastric X-ray image distillation,

G. Li, R. Togo, T. Ogawa, and M. Haseyama, “Soft-label anonymous gastric X-ray image distillation,” in2020 IEEE International Conference on Image Processing (ICIP), 2020, pp. 305–309

work page 2020

[12] [12]

Variational knowledge distillation for disease classification in chest X-rays,

T. van Sonsbeek, X. Zhen, M. Worring, and L. Shao, “Variational knowledge distillation for disease classification in chest X-rays,” in Information Processing in Medical Imaging, 2021, pp. 334–345

work page 2021

[13] [13]

Self-supervised learning with adaptive distillation for hyperspectral image classification,

J. Yue, L. Fang, H. Rahmani, and P. Ghamisi, “Self-supervised learning with adaptive distillation for hyperspectral image classification,”IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1–13, 2021

work page 2021

[14] [14]

A new learning paradigm: Learning using privileged information,

V . Vapnik and A. Vashist, “A new learning paradigm: Learning using privileged information,”Neural Networks, vol. 22, no. 5–6, pp. 544–557, 2009

work page 2009

[15] [15]

Unifying distillation and privileged information,

D. Lopez-Paz, L. Bottou, B. Sch ¨olkopf, and V . Vapnik, “Unifying distillation and privileged information,” inInternational Conference on Learning Representations (ICLR), 2016. [Online]. Available: http://leon.bottou.org/papers/lopez-paz-2016

work page 2016

[16] [16]

Learning with side information through modality hallucination for action recognition,

J. Hoffman, S. Gupta, and T. Darrell, “Learning with side information through modality hallucination for action recognition,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 826–834

work page 2016

[17] [17]

Cross modal distillation for super- vision transfer,

S. Gupta, J. Hoffman, and J. Malik, “Cross modal distillation for super- vision transfer,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 2827–2836

work page 2016

[18] [18]

Paying more attention to attention: Improving the performance of convolutional neural networks via atten- tion transfer,

S. Zagoruyko and N. Komodakis, “Paying more attention to attention: Improving the performance of convolutional neural networks via atten- tion transfer,” inInternational Conference on Learning Representations (ICLR), 2017

work page 2017

[19] [19]

Key challenges for delivering clinical impact with artificial intelligence,

C. J. Kelly, A. Karthikesalingam, M. Suleyman, G. Corrado, and D. King, “Key challenges for delivering clinical impact with artificial intelligence,” BMC Medicine, vol. 17, no. 1, p. 195, 2019

work page 2019

[20] [20]

Prediction models for diagnosis and prognosis of COVID-19 infection: Systematic review and critical appraisal,

L. Wynants, B. van Calster, G. S. Collins, R. D. Riley, G. Heinze, E. Schuit, E. Albu, B. Arshi, V . Bellou, M. M. J. Bontenet al., “Prediction models for diagnosis and prognosis of COVID-19 infection: Systematic review and critical appraisal,”BMJ, vol. 369, p. m1328, 2020

work page 2020

[21] [21]

Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans,

M. Roberts, D. Driggs, M. Thorpe, J. Gilbey, M. Yeung, S. Ursprung, A. I. Aviles-Rivero, C. Etmann, C. McCague, L. Beer, J. R. Weir- McCall, Z. Teng, E. Gkrania-Klotsas, J. H. F. Rudd, E. Sala, C.-B. Sch¨onliebet al., “Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT sca...

work page 2021

[22] [22]

Why rankings of biomedical image analysis competitions should be interpreted with care,

L. Maier-Hein, M. Eisenmann, A. Reinke, S. Onogur, M. Stankovic, P. Scholz, T. Arbel, H. Bogunovic, A. P. Bradley, A. Carass, C. Feldmann, A. F. Frangi, P. M. Full, B. van Ginneken, A. Hanbury, K. Honauer, M. Kozubek, B. A. Landman, K. H. Maier-Hein, H. M ¨ulleret al., “Why rankings of biomedical image analysis competitions should be interpreted with care...

work page 2018

[23] [23]

Machine learning for medical imag- ing: Methodological failures and recommendations for the future,

G. Varoquaux and V . Cheplygina, “Machine learning for medical imag- ing: Methodological failures and recommendations for the future,”npj Digital Medicine, vol. 5, p. 48, 2022

work page 2022

[24] [24]

Covid-19 image data collection,

J. P. Cohen, P. Morrison, and L. Dao, “COVID-19 image data collection,”arXiv preprint arXiv:2003.11597, 2020. [Online]. Available: https://github.com/ieee8023/covid-chestxray-dataset

work page arXiv 2003

[25] [25]

COVID-19 image data collection: Prospective predictions are the future,

J. P. Cohen, P. Morrison, L. Dao, K. Roth, T. Q. Duong, and M. Ghassemi, “COVID-19 image data collection: Prospective predictions are the future,”arXiv preprint arXiv:2006.11988, 2020. [Online]. Available: https://github.com/ieee8023/covid-chestxray-dataset

work page arXiv 2006